Since 2012, the field of artificial intelligence (AI) has seen remarkable progress on a broad range of capabilities including object recognition, game playing, speech recognition, and machine translation, with an increasing number of academics and industry experts now classifying AI as a General Purpose Technology.
The rapid progress of AI has been achieved by increasingly large and computationally intensive deep learning models. Analyses from OpenAI found that the compute needed to train state-of-the-art deep learning models, and with it the training cost, keeps rising over time. The amount of compute used in the largest AI training runs has grown by roughly 300,000x since 2012, doubling every few months.
An even sharper trend can be observed in NLP word-embedding approaches by looking at ELMo, BERT, OpenAI's GPT-2, XLNet, Megatron-LM, T5, and GPT-3. While these models have become increasingly accurate, the accuracy improvements come with an economic trade-off: they depend on exceptionally large computational resources, which in turn consume substantial energy. Hence, the costs are not just financial, but environmental as well. The table below gives estimated CO2 emissions from training common NLP models, compared with familiar sources of emissions. The numbers say it all:
Hence, these models are costly to train and develop both financially, due to the cost of hardware and electricity or cloud compute time, and environmentally, due to the carbon footprint of powering modern tensor processing hardware.
The upshot is that these costs make AI research and application prohibitively expensive, raising barriers to participation in and adoption of AI.
AI Cost Centers
In short, the lack of clarity regarding AI costs and economics comes down to confusing AI business models with SaaS business models. AI applications may look and feel like normal software: they are code-based, data-hungry, and have to interoperate with other tech stacks. But one difference separates AI from SaaS business models, and it is the source of this lack of economic focus: the intense focus on models.
The crux of an AI application is its trained models, which interpret images, transcribe speech, generate natural language, and perform other complex tasks. Currently, most conversations about AI focus excessively on these models. The reason for this hyperfocus has more to do with culture than with technical excellence. Walk into any AI firm or team and you'll hear the excitement of getting "state-of-the-art" results that can be published on leaderboards such as SuperGLUE. These leaderboards typically report accuracy (or similar technical measures) but omit any mention of cost-efficiency. As a result, the economics of AI is sacrificed at the altar of performance and reputation gain.
But understanding the costs of AI is key to AI adoption and research. And it ain't cheap. Firstly, training these models can be very expensive. The Allen Institute for AI puts the average cost to train an AI model at $1 per 1,000 parameters (depending on model complexity, training data availability, existing libraries, and running costs). As the number of parameters increases, so does the cost:
- 110 million parameters: $2.5K to $50K
- 340 million parameters: $10K to $200K
- 1.5 billion parameters: $80K to $1.6M
For example, GPT-3 cost a few million dollars to build and train, in addition to the $1B supercomputer from Microsoft.
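To put the quoted ranges on the same footing as the per-parameter figure, the short sketch below divides each range by its parameter count. It only uses the numbers listed above; the function and variable names are illustrative choices, not part of any cited analysis.

```python
# Parameter counts and training-cost ranges (USD) as quoted in the list above.
QUOTED_RANGES = {
    110_000_000: (2_500, 50_000),
    340_000_000: (10_000, 200_000),
    1_500_000_000: (80_000, 1_600_000),
}


def cost_per_thousand_params(params: int, cost_usd: float) -> float:
    """Return the training cost in USD per 1,000 parameters."""
    return cost_usd / (params / 1_000)


def implied_rates(ranges: dict) -> dict:
    """Map each parameter count to its (low, high) cost per 1,000 parameters."""
    return {
        params: (
            cost_per_thousand_params(params, low),
            cost_per_thousand_params(params, high),
        )
        for params, (low, high) in ranges.items()
    }


if __name__ == "__main__":
    for params, (low, high) in implied_rates(QUOTED_RANGES).items():
        print(f"{params:>13,} params: ${low:.3f} to ${high:.3f} per 1,000 parameters")
```

On these figures the implied rate runs from a few cents to just over a dollar per 1,000 parameters, so the $1 figure sits near the top of the quoted ranges rather than in the middle.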
If training is expensive, so is running the model. Training AI models can cost hundreds of thousands of dollars (or more) in compute resources, and it is not a one-time expenditure. As user adoption of the AI application increases, new data comes in and so do new consumer demands. As a result, the data that feeds the AI models tends to change over time, a phenomenon known as "data drift". The models then need to be re-trained, which means more computation costs.
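One common way to notice data drift is to compare how a feature is distributed in production with how it was distributed at training time. The sketch below uses the Population Stability Index (PSI), a technique picked here for illustration rather than one named in this article. The function name, the bin count, the synthetic data, and the 0.25 retraining threshold (a commonly cited rule of thumb) are all assumptions.

```python
import math
import random


def population_stability_index(reference, current, bins=10):
    """Compare two samples of one numeric feature; higher means more drift."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature

    def bucket_shares(sample):
        counts = [0] * bins
        for x in sample:
            # Clamp values outside the reference range into the edge buckets.
            idx = min(max(int((x - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) when a bucket is empty.
        return [max(c / len(sample), 1e-6) for c in counts]

    ref, cur = bucket_shares(reference), bucket_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Synthetic data: one fresh sample from the training distribution, one shifted.
rng = random.Random(0)
training = [rng.gauss(0.0, 1.0) for _ in range(5_000)]
stable = [rng.gauss(0.0, 1.0) for _ in range(5_000)]
drifted = [rng.gauss(1.0, 1.0) for _ in range(5_000)]

RETRAIN_THRESHOLD = 0.25  # assumed cut-off; tune per application

for name, sample in [("stable", stable), ("drifted", drifted)]:
    psi = population_stability_index(training, sample)
    print(f"{name}: PSI={psi:.3f}, retrain={psi > RETRAIN_THRESHOLD}")
```

A check like this is cheap compared with re-training, so it can run on a schedule and trigger the expensive re-training job only when the inputs have actually moved.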
This brings us to the third cost center: cloud computation. The data that AI applications process is increasingly dense media, such as images, audio, or video. These types of data consume more storage than usual and are expensive to process. They often also suffer from region-of-interest issues, which add further cost: an application may need to process a large file to find a small, relevant snippet. Model inference (the process of generating predictions in production) is therefore more computationally complex than operating traditional software, as executing a long series of matrix multiplications simply requires more math than, for example, reading from a database.
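To give a feel for "more math", the sketch below counts the arithmetic operations for a single prediction through a stack of dense layers, using the usual approximation of two operations (one multiply, one add) per weight. The layer sizes describe a hypothetical network chosen for illustration, not a model discussed in this article.

```python
from typing import Sequence


def dense_layer_flops(inputs: int, outputs: int) -> int:
    """Approximate floating-point operations for one dense layer on one input.

    Each output needs `inputs` multiplications and about as many additions.
    """
    return 2 * inputs * outputs


def network_flops(layer_sizes: Sequence[int]) -> int:
    """Sum the per-layer cost of a stack of dense layers for a single input."""
    return sum(
        dense_layer_flops(n_in, n_out)
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical network: a flattened 224x224 RGB image, then three dense layers.
layers = [224 * 224 * 3, 4096, 4096, 1000]
print(f"{network_flops(layers):,} floating-point operations per prediction")
```

Even this small, assumed network needs over a billion operations for one image, whereas a keyed database read is mostly a lookup. Multiply that by every request in production and the inference bill becomes clear.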
Lastly, while the model is key to making the AI application effective, it is only one part of the equation. A myopic obsession with modeling ignores the reality that modeling is a small part of a large, expensive process. Data acquisition and preparation can take up 50-80% of the AI team's time. Experiment management and continuous analyses are also computationally expensive and lead to heavy cloud infrastructure usage. Some AI firms spend up to 25% of revenue on cloud usage fees, which leads to lower gross margins: AI gross margins sit in the 50-60% range, versus 60-80% or more for SaaS businesses.
Finally, as the business solution evolves, trained experts in other domains will need to intervene at regular intervals (more hiring or contracting), and customer service costs will rise as the consumer base grows. As a result, an AI business can at times look more like a services business, with (human) customer management teams trying to fill the chasm between the AI's evolution and the growing needs expressed by end users. Hence, a high CapEx comes with a growing OpEx as the AI application grows in range and solution spaces.
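The margin gap can be made concrete with a little arithmetic. In the sketch below only the 25% cloud share comes from the text above; the revenue figure and every other cost share are hypothetical values chosen so the results land inside the quoted margin ranges.

```python
def gross_margin(revenue: float, cost_of_revenue: float) -> float:
    """Gross margin as a fraction of revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cost_of_revenue) / revenue


revenue = 1_000_000  # hypothetical annual revenue in USD

# AI firm: 25% of revenue on cloud fees (from the text) plus an assumed
# 20% on human support and data operations.
ai_costs = 0.25 * revenue + 0.20 * revenue

# SaaS firm: assumed 10% on hosting and 15% on support.
saas_costs = 0.10 * revenue + 0.15 * revenue

print(f"AI gross margin:   {gross_margin(revenue, ai_costs):.0%}")
print(f"SaaS gross margin: {gross_margin(revenue, saas_costs):.0%}")
```

Under these assumptions the cloud bill alone accounts for most of the difference between the two margins.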
Solutions:
Solutions to these issues are emerging and can be categorized as technical, process-based, and ecosystem resolutions:
- Technical Solutions: AI can be made more efficient by feeding on AI
- The high costs associated with large AI models are motivating researchers to find more cost-effective alternatives. For example, three months after GPT-3's release, a team of scientists at LMU Munich developed Pattern-Exploiting Training (PET), a deep learning training technique for NLP models. Using it, they trained a Transformer NLP model with 223M parameters that outperformed the 175B-parameter GPT-3 by over 3 percentage points, effectively exceeding GPT-3's performance with 99.9% fewer parameters.
- Efficiencies such as these are being seen not just in software, but also in hardware. For example, the emerging space of tinyML is fast gaining adopters who wish to address the cost of using AI. TinyML is the idea of running machine learning on microcontrollers and other embedded devices at less than 1 milliwatt. In the video below, some of the industry pioneers of this technology (ARM + NVIDIA) explain what tinyML is and how it increases the use of AI and data in different areas, whilst respecting cost and environmental constraints.
[youtube https://www.youtube.com/watch?v=HQ5vabRWuVI]
- Finally, we are seeing growth in technical solutions that can be categorized as AI for AI devs. Increasingly, tools such as DrRepair (automated bug detection and fixing), Kite (automated code completion), and NLP-based solutions that convert code from one programming language to another (e.g. C++ to Java) are helping developers with arduous, time-consuming tasks, especially when building and re-building the model.
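Two of the figures quoted in the technical solutions above are easy to sanity-check with back-of-the-envelope arithmetic: the parameter saving of the PET model relative to GPT-3, and what a constant 1 milliwatt draw adds up to over a year. The function names are illustrative, and the energy figure assumes the device runs continuously at exactly 1 mW.

```python
def parameter_reduction(small: float, large: float) -> float:
    """Fraction of parameters saved by the smaller model."""
    return 1 - small / large


def annual_energy_wh(power_watts: float, hours: float = 24 * 365) -> float:
    """Energy in watt-hours for a constant power draw over `hours`."""
    return power_watts * hours


# 223M-parameter PET model versus the 175B-parameter GPT-3.
print(f"{parameter_reduction(223e6, 175e9):.2%} fewer parameters")

# A device drawing a constant 1 milliwatt for a year (assumed duty cycle: 100%).
print(f"{annual_energy_wh(0.001):.2f} Wh per year")
```

The first figure comes out at 99.87%, which rounds to the 99.9% quoted above. The second is under 9 watt-hours for a full year of continuous operation.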
- New Processes to gain efficiency in AI dev:
The key takeaway here is that the way we develop AI is increasingly becoming similar to the way we develop SaaS solutions. Agile, Scrum, Lean, and similar methods all emerged to improve the efficiency (and lower the cost) of development. As the use of AI has grown, tools and processes similar to those we use to build SaaS products are becoming increasingly mainstream.
- Floydhub is a cloud-based platform that provides AI devs with tools to increase workflow productivity.
- Kubeflow is the machine learning toolkit for Kubernetes. Devs who are used to running Kubernetes should be able to run Kubeflow and build their AI models in the same way they build applications.
- In the same vein as microservices, we are increasingly seeing the "API-zation" of AI with the growth of low-cost SaaS offerings for AI development. Google's AutoML allows developers with limited machine learning expertise to experiment with and train high-quality models specific to their business needs. Via API-based solutions, AI devs can now automate aspects of experimentation and model development, automate feature engineering, tune or optimize model hyperparameters, or compress large DNNs for mobile or edge devices.
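To show the kind of work that hyperparameter tuning services automate, here is a minimal random search written from scratch. It is a generic illustration, not how AutoML or any specific product works internally; `validation_loss` is a hypothetical stand-in for training a model and scoring it, and the search ranges are assumptions.

```python
import random


def validation_loss(learning_rate: float, dropout: float) -> float:
    """Hypothetical stand-in for training a model and measuring validation loss."""
    return (learning_rate - 0.01) ** 2 * 1e4 + (dropout - 0.3) ** 2


def random_search(trials: int, seed: int = 0):
    """Sample hyperparameters at random and keep the best configuration seen."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(trials):
        config = {
            # Sample the learning rate on a log scale between 1e-4 and 1e-1.
            "learning_rate": 10 ** rng.uniform(-4, -1),
            "dropout": rng.uniform(0.0, 0.6),
        }
        loss = validation_loss(**config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


best_config, best_loss = random_search(trials=200)
print(best_config, best_loss)
```

In a real project each trial is a full training run, so the number of trials translates directly into compute cost. That is why handing this loop to a managed service with smarter search strategies can save money.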
- Ecosystems Turbocharge AI Dev (at lower cost)
Just as Kaggle dramatically lowered the cost of data-based solutions whilst leveraging cognitive diversity to tackle difficult prediction and complex problem-solving challenges, we are now seeing robust, ecosystem-led AI solutions that follow the same approach and lower the cost of and entry barriers to AI:
- As the volume of data grows in tandem with the use of AI, so does the size and number of libraries and databases. Google's Dataset Search provides Labeled-Data-as-a-Service, where users can discover datasets hosted in thousands of repositories across the Web at low prices. Free solutions such as spaCy (v3.0) give NLP devs an open-source library for "Industrial Strength NLP". spaCy even comes with a workflow system (more process formalization). AI devs can now use tools such as CheckList for taxonomy generation (synonyms, word categories…) and for fairness and behavioral testing. All these tools are made by the AI community, for the AI community, and aid the spread and use of AI.
- As seen in the cost centers, data preparation is a significant chunk of AI development. Today, AI devs can use solutions such as those offered by SCALE.AI, which provides AI developers with high-quality training and validation data for AI applications. This removes the burden of searching for appropriate data and addresses garbage-in-garbage-out scenarios. Labelbox is another popular tool that helps AI devs train their models. It provides an end-to-end platform to create the right training data, manage the data and the process in one place, and support production pipelines with APIs.
Closing thoughts
Think about our apps today: if you are a firm that wants to launch an e-commerce platform, chances are you are going to use an API like Stripe to process your payments. But while Stripe is an API provider, it is also an API consumer. Stripe works with Twilio, which provides the messaging and notification API to Stripe end users.
This {API x API x API} paradigm is the fundamental economic force behind the rise and scale of SaaS business models. As any technology evolves, it fragments existing value chains, finds pockets of specialization, and builds business models around them, a phenomenon that has been tracked, traced, and proved by academics like Clayton Christensen and tech observers like Kevin Kelly.
We're seeing the same thing today with AI, and the motivation is good tech and smart money in equal parts. As the prohibitive financial and environmental costs of AI become clear, leaderboards focused on the environmental efficiency of AI are now emerging.
As the volume of data continues to grow in step with demands for more automation and application of AI, new technical solutions, efficient processes, and collaborative ecosystems will be the cornerstone of robust AI-led innovation and help us treat the planet with respect.
Authors
Kary Bheemaiah, CTIO, Capgemini Invent | Mark Esposito, Chief Learning Officer, Nexus FrontierTech