RE: LeoThread 2024-11-03 06:11 — Hive

You are viewing a single comment's thread from:

RE: LeoThread 2024-11-03 06:11

taskmaster4450le (80)in LeoFinance • 3 months ago

What Are Hallucinations?

Large language models can generate responses that seem logical or coherent but contain incorrect or inconsistent information. We refer to this phenomenon as a hallucination.

For example, a model might say something like, ‘Marseille is the capital of France.’ While this statement is false, it could sound perfectly plausible without checking with an external truth source.

For example, in response to a question about the health benefits of particular foods, the model would likely consult an internet source and communicate what it has learned. However, not every piece of online information is true or relevant. Our model could quickly obtain the wrong sources and give bad advice.

Another cause of such errors is that LLMs can misrepresent the context in which a prompt is presented. This can lead to a response that is contextually inappropriate or inaccurate.

3 months ago in LeoFinance by taskmaster4450le (80)

$0.00

Sort:

taskmaster4450le (80) 3 months ago

Causes of Hallucinations in Large Language Models
We’ll review the main factors contributing to this issue. These include

training data quality
temporal limitations of data
the probabilistic nature of large language models
a lack of real-world understanding
ambiguities and complex prompts
overfitting

$0.00

taskmaster4450le (80) 3 months ago

Overfitting happens when we train a machine learning model too much tuned to the training set. As a result, the model learns the training data too well, but it can’t generate good predictions for unseen data. An overfitted model produces low accuracy results for data points unseen in training, hence, leads to non-optimal decisions.

$0.00

taskmaster4450le (80) 3 months ago

A model unable to produce sensible results on new data is also called “not able to generalize.” In this case, the model is too complex, and the patterns existing in the dataset are not well represented. Such a model with high variance overfits.

Overfitting models produce good predictions for data points in the training set but perform poorly on new samples.

$0.00

taskmaster4450le (80) 3 months ago

Underfitting occurs when the machine learning model is not well-tuned to the training set. The resulting model is not capturing the relationship between input and output well enough. Therefore, it doesn’t produce accurate predictions, even for the training dataset. Resultingly, an underfitted model generates poor results that lead to high-error decisions, like an overfitted model.

An underfitted model is not complex enough to recognize the patterns in the dataset. Usually, it has a high bias towards one output value. This is because it considers the variations of the input data as noise and generates similar outputs regardless of the given input.

When training a model, we want it to fit well to the training data. Still, we want it to generalize and generate accurate predictions for unseen data, as well. As a result, we don’t want the resulting model to be on any extreme.

$0.00

taskmaster4450le (80) 3 months ago

Let’s consider we have a dataset residing on an S-shaped curve such as a logarithmic curve. Fitting a high-order parabola passing through the known points with zero error is always possible. On the other hand, we can fit a straight line with a high error rate.

The first solution generates an overly complex model and models the implicit noise as well as the dataset. As a result, we can expect a high error for a new data point on the original S-shaped curve.

Conversely, the second model is far too simple to capture the relationship between the input and output. Hence, it will perform poorly on new data, too:

$0.00

taskmaster4450le (80) 3 months ago

Cures for Underfitting
To prevent underfitting, we need to ensure the model complexity.

The first method that comes to mind is to obtain more training data. However, this is not an easy task for most problems. In such cases, we can bring data augmentation into service. So, we can increase the amount of data available by creating slightly modified synthetic copies of the data points at hand.

Similarly, increasing the number of passes on the training data is a viable approach for iterative algorithms. Increasing the number of epochs in a neural network is a well-known practice to ensure model fitting.

$0.00

taskmaster4450le (80) 3 months ago

Another way to increase model complexity is to increase the size and number of model parameters. We can introduce engineered features from the dataset. For example, a product of numerical features or n parameter of n-grams generates new features.

Alternatively, we can reduce regularization. Some implementations implicitly include default regularization parameters to overfitting. Checking the default parameters is a good start point. As we’re trying to get out of a limited feature set, there’s no need to introduce limiting terms into the model.

Replacing the approach is another solution. For example, the selection of the kernel function in SVM determines the model complexity. Thus, the choice of kernel function might lead to overfitting or underfitting.

$0.00

taskmaster4450le (80) 3 months ago

Let’s summarize what we’ve discussed so far in a comparison table:

Overfitting	Underfitting
Model is too complex	Model is not complex enough
Accurate for training set	Not accurate for training set
Not accurate for validation set	Not accurate for validation set
Need to reduce complexity	Need to increase complexity
Reduce number of features	Increase number of features
Apply regularization	Reduce regularization
Reduce training	Increase training
Add training examples	Add training examples

$0.00