Jurassic-1 Language Models
Jurassic-1 (J1) is the first generation in a series of large language models trained and made widely accessible by AI21 Labs.
AI21 Studio provides access to general purpose J1 models with up to 178B parameters, and allows users to train custom J1 models for a specific task, offering improved performance and scalability. If you want to grow your app beyond a proof-of-concept and efficiently serve production-scale traffic, setting up a custom model is a good idea.
To learn more, read our case study blog post, where we address a specific language task, initially using general-purpose models and then using a production-grade custom model.
General Purpose Models
There are two versions of general purpose Jurassic-1 models available in the open beta, differing by size: J1-Jumbo, with 178B parameters, is the larger and more capable of the two models; J1-Large, with 7.5B parameters, is smaller and faster but overall less capable, though still very effective for many use-cases.
Both models share the same Transformer-based architecture with a 2048-token context length and a 256K-item vocabulary. Because of the larger vocabulary, roughly 50% more text fits in the same context length compared to other LMs. That allows for longer prompts and therefore higher accuracy or, alternatively, fewer tokens for the same input, leading to faster inference. Moreover, while the vocabularies of most other models are limited to words and word pieces, J1's vocabulary also includes multi-word expressions, such as "Golden State Warriors" and "Barack Obama".
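As a rough illustration of this trade-off, the sketch below works through the arithmetic. The 1.5x factor comes from the "roughly 50% more text" figure above; the baseline of 4 characters per token is an assumed, illustrative value and not a measured property of any particular tokenizer.

```python
CONTEXT_TOKENS = 2048            # J1 context length, from the text above
TEXT_PER_TOKEN_GAIN = 1.5        # "roughly 50% more text" per token (approximate)
BASELINE_CHARS_PER_TOKEN = 4.0   # ASSUMPTION: illustrative baseline, not a measurement


def chars_in_context(chars_per_token: float, context_tokens: int = CONTEXT_TOKENS) -> float:
    """Approximate number of characters that fit in a full context window."""
    return chars_per_token * context_tokens


def tokens_for_text(num_chars: float, chars_per_token: float) -> float:
    """Approximate number of tokens needed to encode a text of num_chars characters."""
    return num_chars / chars_per_token


j1_chars_per_token = BASELINE_CHARS_PER_TOKEN * TEXT_PER_TOKEN_GAIN

# Option 1: same context length, more text (longer prompts).
baseline_capacity = chars_in_context(BASELINE_CHARS_PER_TOKEN)
j1_capacity = chars_in_context(j1_chars_per_token)

# Option 2: same text, fewer tokens (faster inference).
baseline_tokens = tokens_for_text(baseline_capacity, BASELINE_CHARS_PER_TOKEN)
j1_tokens = tokens_for_text(baseline_capacity, j1_chars_per_token)
token_fraction = j1_tokens / baseline_tokens  # about two thirds

print(f"capacity ratio: {j1_capacity / baseline_capacity:.2f}")
print(f"tokens needed for the same text: {token_fraction:.0%} of baseline")
```

Under these assumptions, the same input needs about two thirds as many tokens, which is where the inference speedup comes from.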
A complete description of Jurassic-1, including benchmarks and quantitative comparisons with other models, can be found in our technical paper.
The general purpose models can be applied to virtually any language task by crafting a suitable prompt containing a description of the task and/or a few examples, a process commonly known as “prompt engineering”. If you’re looking for inspiration, you can find example use-cases implemented with prompt engineering in our blog post. With trial and error, you should be able to bootstrap a prompt that yields good performance for your use-case. However, to achieve even better performance and scale up your app, we recommend that you transition to a custom model.
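A minimal sketch of how such a prompt might be assembled from a task description and a few examples follows. The helper name `build_few_shot_prompt`, the "Input:"/"Output:" labels and the "##" separator are illustrative assumptions, not a format required by J1; any consistent layout can serve the same purpose.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    task_description: str,
    examples: List[Tuple[str, str]],
    new_input: str,
    separator: str = "\n##\n",  # ASSUMPTION: illustrative separator between examples
) -> str:
    """Assemble a few-shot prompt that ends where the model should continue."""
    # One block per solved example.
    blocks = [f"Input: {inp}\nOutput: {out}" for inp, out in examples]
    # The final block leaves the output empty for the model to complete.
    blocks.append(f"Input: {new_input}\nOutput:")
    body = separator.join(blocks)
    return f"{task_description}\n\n{body}" if task_description else body


prompt = build_few_shot_prompt(
    "Classify the sentiment of each review as positive or negative.",
    [("Great value, works perfectly.", "positive"),
     ("Broke after two days.", "negative")],
    "Exactly what I was looking for.",
)
print(prompt)
```

Using the same separator as a stop sequence in the completion request is a common way to keep the model from generating further examples after the answer.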
Advantages of custom models
AI21 Studio allows you to train and use your own custom J1 models. Custom models are tuned for optimal performance on a training set of examples representing a specific task. This offers numerous advantages compared to engineering a prompt for a general purpose model:
Given a sufficient number of training examples, custom J1 models match or exceed the accuracy of J1-Jumbo with prompt engineering. For many use-cases, you can expect custom models to begin outperforming prompt engineering with as few as 50-100 examples.
Custom models are lightweight and can be efficiently deployed to serve large volumes of traffic.
Furthermore, since the training process bakes the task-specific behavior into the custom model, there’s no need to waste time on processing the same prompt tokens in every request.
As a consequence of the last two points, custom models promise lower latency, with 1.5-3x speedup compared to J1-Jumbo for most tasks (and even better speedup for some tasks).
Custom models derive their quality from the training data you provide; adding more, higher quality examples will improve results. This means you can continuously refine your custom model by curating high quality data for your task.
Custom models can be trained to perform virtually any language task. A good (but not airtight) rule of thumb is that a custom model can be applied to a task if J1-Jumbo performs it well with an engineered prompt containing a few examples, and performance improves as more examples are added.
Getting your own custom model and using it
To obtain access to your own custom models, follow these simple steps:
Submit an application form. Include the data you have available for training the model. After you submit the form, you will receive an email acknowledging submission and we will review the information you provided.
Once your application is approved, you will be notified via email and we will start training a custom model for you. At this point, we may contact you again asking for additional information or training data.
Finally, when your custom model is trained, you will receive an email notification with instructions on how to use it.
You can train and use multiple custom models. To train a new custom model, simply repeat the steps above.
All your custom models will appear in the list of available models in the account page and will be available in the model dropdown in the playground. To use your custom model in an API call, post a complete request to the correct URL corresponding to the model’s name as it appears in the account page.
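The sketch below shows one way such a request could be put together with the Python standard library. The base URL, the `/complete` path layout, the JSON field names (`prompt`, `maxTokens`, `temperature`) and the bearer-token header are assumptions made for illustration, and `my-custom-model` is a hypothetical name; check the API reference and your account page for the exact URL and parameters of your model.

```python
import json
import urllib.request

# ASSUMPTION: illustrative base URL; use the URL given for your model.
API_BASE = "https://api.ai21.com/studio/v1"


def build_complete_request(
    model_path: str,
    prompt: str,
    api_key: str,
    max_tokens: int = 64,
    temperature: float = 0.7,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a completion."""
    url = f"{API_BASE}/{model_path}/complete"
    # ASSUMPTION: field names are illustrative; verify them in the API reference.
    payload = {"prompt": prompt, "maxTokens": max_tokens, "temperature": temperature}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "my-custom-model" is a hypothetical model name.
request = build_complete_request("my-custom-model", "Input: hello\nOutput:", "YOUR_API_KEY")

# Sending the request requires a valid key and network access:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Separating request construction from sending keeps the URL and payload easy to inspect and log before any traffic is sent.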
Note: Custom models are private. Only you can send requests to your custom models; other AI21 Studio users don’t have access to them.
Custom model performance depends on the amount and quality of the training data you provide. As a starting point for most tasks, we recommend using 50-100 examples. If you have more data, that’s even better!
If you don’t have enough pre-existing training data, you can use J1 general purpose models to efficiently generate more data. Assuming you’ve already created a prompt with just a few examples that works well for your task, you can leverage it in one of two ways:
If you have examples of inputs that represent your use-case, feed them to J1 with your prompt, and collect the outputs generated by the model. The pairs of inputs and generated outputs will be your training set. Note that you can often collect relevant input examples relatively easily, either from public sources on the web or from your own data.
If you don’t have access to relevant input examples, you can let J1 generate both the inputs and the outputs. Feed J1 with a sequence of examples (input #1, output #1, input #2, output #2 and so on) and let it generate more examples. This tends to work well for short inputs, up to a few sentences long, but may result in a higher rate of bad examples for longer inputs (e.g. whole articles).
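Both approaches can be sketched in a few lines. In the code below, `complete` is a hypothetical callable that sends a prompt to a J1 general purpose model and returns the generated text; the "Input:"/"Output:" labels and the "##" separator are likewise illustrative assumptions about how the prompt is laid out.

```python
from typing import Callable, List, Tuple

# ASSUMPTION: hypothetical wrapper around a completion call (prompt in, text out).
Complete = Callable[[str], str]


def label_inputs(few_shot_prefix: str, inputs: List[str], complete: Complete) -> List[Tuple[str, str]]:
    """First approach: pair each real input with a model-generated output."""
    pairs = []
    for text in inputs:
        prompt = f"{few_shot_prefix}Input: {text}\nOutput:"
        pairs.append((text, complete(prompt).strip()))
    return pairs


def parse_generated_examples(generated: str) -> List[Tuple[str, str]]:
    """Second approach: split model-generated text into (input, output) pairs.

    Expects blocks of the form "Input: ...\\nOutput: ..." separated by "##".
    """
    pairs = []
    for block in generated.split("##"):
        block = block.strip()
        if not block.startswith("Input:") or "\nOutput:" not in block:
            continue  # skip truncated or malformed blocks
        inp, out = block[len("Input:"):].split("\nOutput:", 1)
        if inp.strip() and out.strip():
            pairs.append((inp.strip(), out.strip()))
    return pairs


# Usage with a stub in place of a real model call.
def stub(prompt: str) -> str:
    return " positive\n"


labeled = label_inputs("Classify sentiment.\n\n", ["Loved it.", "Works well."], stub)
parsed = parse_generated_examples("Input: x\nOutput: y\n##\nInput: z\nOutput:")
```

The parser drops the final block when generation stops mid-example, which is a common failure mode when the model is left to produce a sequence of examples on its own.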
The process above is illustrated step-by-step in our blog post. We recommend you review and validate the generated content (or at least a sample from it) before training a custom model, to make sure the data properly captures the desired behavior. Watch out for incorrect, corrupted or toxic generations, as including these in the training data will negatively affect the resulting custom model. If you find such bad examples, amend them manually or simply exclude them from the training data.
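Part of this review can be automated before a manual pass. The following sketch drops empty, duplicated, overly long or blocklisted examples; the function name, the length threshold and the blocklist approach are illustrative assumptions, and a simple filter like this does not replace reading a sample of the data yourself.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]


def filter_examples(
    pairs: Iterable[Pair],
    blocklist: Iterable[str] = (),
    max_output_chars: int = 1000,  # ASSUMPTION: illustrative threshold
) -> Tuple[List[Pair], List[Pair]]:
    """Split generated (input, output) pairs into kept and rejected lists."""
    terms = [term.lower() for term in blocklist]
    seen = set()
    kept, rejected = [], []
    for inp, out in pairs:
        pair = (inp.strip(), out.strip())
        text = f"{pair[0]} {pair[1]}".lower()
        is_bad = (
            not pair[0] or not pair[1]            # corrupted: missing input or output
            or len(pair[1]) > max_output_chars    # suspiciously long generation
            or pair in seen                       # exact duplicate
            or any(term in text for term in terms)  # contains a blocklisted term
        )
        if is_bad:
            rejected.append(pair)
        else:
            seen.add(pair)
            kept.append(pair)
    return kept, rejected


kept, rejected = filter_examples(
    [("a", "b"), ("a", "b"), ("c", ""), ("d", "some badword here")],
    blocklist=["badword"],
)
```

Keeping the rejected examples, rather than discarding them, makes it easy to amend them manually and add them back to the training set.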