Development of Sensory Testing
emmae-thaleen -
1.0 DEVELOPMENT OF SENSORY TESTING IN FOOD INDUSTRY
1.1 Introduction:
Sensory tests have been conducted for as long as human beings have
evaluated the quality of their surroundings. Formal sensory analysis began
during wartime, as part of efforts to provide food to soldiers. Sensory
testing is valuable because it helps determine a product's acceptability in
the marketplace. The three principal uses of sensory techniques are quality
control, product development, and research. The primary function of sensory
testing is to conduct valid and reliable tests that provide data.
The many kinds of tests can be classified into two major groups:
analytical tests and affective tests. Analytical tests can be divided into
overall difference tests, attribute difference tests, and descriptive tests.
Affective tests are based on consumer testing.
There are many types of difference test. A Triangle Test is a sensory
test used to determine whether a difference exists between products. Such
differences could arise from ingredients, processing, or packaging. The test
involves presenting three samples and asking which sample is different. In
any type of test, leaving room for panelists to make comments is also
beneficial, because comments can sometimes better explain their choices. A
Two-out-of-Five test is similar to the Triangle Test: panelists are asked to
pick the two samples out of five that are similar in characteristics.
In Multiple Paired Comparison Tests, panelists are asked to taste two
samples and rate an attribute such as saltiness. The panelists may be asked
to mark the sample that is the most or least salty. This test involves a
number of sample pairs.
In a ranking test, panelists are asked to rank samples in order of an
attribute that the samples possess (or lack). Ranking samples of apples on
levels of crispness (most crisp to mushy) is an example of a ranking test.
Ranking the brown color of various types of French fries after deep-fat
frying (using different types of potatoes may change the intensity of
browning) is another example of a Difference Ranking Test.
Descriptive analysis methods involve the detection (discrimination)
and the description of both the qualitative and quantitative sensory aspects
of a product by trained panels of 5 to 100 judges (subjects). The attributes
of a food or product are identified and quantified using human subjects who
have been trained for this purpose. Descriptive analysis is appropriate when
detailed information is required on individual characteristics of the
product or material or both. Descriptive tests can provide information that
cannot be obtained by other analytical means. The analysis can include all
parameters of the product or it can be limited to certain aspects.
Smaller panels of five to ten subjects are used for the typical product
on the grocery shelf, whereas larger panels are used for mass-produced
products where small differences can be very important, for example beers
and soft drinks. Panelists must be able to detect and describe the perceived
sensory attributes of a sample. In addition, panelists must learn to
differentiate and rate the quantitative or intensity aspects of a sample,
and to define to what degree each characteristic or qualitative note is
present in that sample. Panelists must be screened and qualified to
participate and must maintain their skills.
The qualitative aspects of a product combine to define the product and
include all of the appearance, aroma, flavor, texture, or sound properties of a
product that differentiate it from others. The goal of descriptive analysis is to
provide a quantitative specification of the important sensory aspects of a
product. Use descriptive tests to obtain detailed description of the aroma,
flavor, and oral texture of foods and beverages, skin-feel of personal care
products, hand-feel of fabrics and paper products, and the appearance and
sound of any product.
Qualitative factors include terms that define the sensory profile or
picture of the sample. The quantitative (intensity) aspects are rated on
scales, of which three types are used: category scales, line scales, and
magnitude estimation. The order of appearance of physical properties related
to oral, skin, and fabric textures is generally predetermined by the way the
product is handled (the input of forces by the panelist). In addition to the
detection and description of the qualitative, quantitative, and time factors
that define the sensory characteristics of a product, panelists are capable
of, and management is often interested in, some integrated assessment of the
product properties. The overall impression includes total intensity of aroma
or flavor; balance or blend (amplitude) of the aroma; overall difference of
the sample; and hedonic ratings.
2.0 OVERALL DIFFERENCE TESTS
2.1 Triangle test
The objective of the triangle test is to determine whether a sensory
difference exists between two products. The method is particularly useful
when a treatment may have produced changes in a product that cannot be
characterized simply by one or two attributes. Statistically, the method is
more efficient than the paired comparison and duo-trio methods, but it has
limited use with products that involve sensory fatigue, carryover, or
adaptation. It also has limited use with subjects who find testing three
samples confusing.
Despite these limits, the triangle test is effective in certain
situations. First, it can determine whether product differences result from
a change in ingredients, processing, packaging, or storage. Second, it can
determine whether an overall difference exists where no specific attribute
can be identified as having been affected. Third, it is effective for
selecting and monitoring panelists for their ability to discriminate given
differences.
In this test, each panelist is presented with three coded samples. Two
of the three samples are identical and one is different (odd). Panelists
taste, feel, or examine each product in order from left to right and
identify the odd sample. The number of panelists who identify the odd sample
correctly is counted, and the result is interpreted by referring to a
statistical table.
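The table lookup can also be approximated by a direct calculation. The
following is a minimal sketch, assuming the usual binomial model in which
each panelist who cannot perceive a difference picks the odd sample by
chance with probability 1 in 3. The function name and the panel figures (30
panelists, 16 correct) are hypothetical and are not taken from the text.

```python
from math import comb

def binomial_tail(correct: int, n: int, p_guess: float) -> float:
    """Probability of at least `correct` correct answers out of n
    when every answer is a pure guess with success chance p_guess."""
    return sum(
        comb(n, k) * p_guess**k * (1 - p_guess) ** (n - k)
        for k in range(correct, n + 1)
    )

# Hypothetical panel: 30 panelists, 16 identify the odd sample.
p_value = binomial_tail(16, 30, 1 / 3)
print(f"p-value = {p_value:.4f}")
```

A small p-value suggests that the number of correct identifications is
unlikely to be the result of guessing alone.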
Generally, 20 to 40 panelists are needed for a triangle test. When
differences are large and easy to identify, as few as 12 panelists can be
used. A similarity test, on the other hand, requires 50 to 100 panelists.
Panelists must be familiar with the triangle test format, the procedure, and
the product being tested, because flavour memory is important in this test.
An orientation session is recommended before panelists take the test, and
care must be taken that the information given is instructive and motivating
without biasing the panelists.
Two things need to be controlled in this test: the test area and the
preparation of the samples. The lighting in the tasting area must be
controlled in order to reduce any colour variables, and the samples should
be prepared under optimum conditions for the product type being tested.
Questions about acceptance, preference, degree of difference, or type of
difference should not be asked after the initial selection of the odd
sample, because the responses may be biased.
2.2 Two-out-of-Five Test
The Two-out-of-Five Test is statistically very efficient compared with
the triangle test, because the chance of guessing correctly is only 1 in 10,
compared with 1 in 3 for the triangle test. However, the test is strongly
affected by sensory fatigue and memory effects, so its principal use is in
visual, auditory, and tactile applications rather than in flavor testing.
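The 1-in-10 figure can be checked by counting. The sketch below simply
enumerates every way of picking two positions out of five and counts the
picks that select exactly the two samples of the minority type; the variable
names are illustrative only.

```python
from itertools import combinations

samples = ["A", "A", "B", "B", "B"]  # two of one type, three of the other
positions = range(len(samples))

# Every way a panelist could pick two of the five positions.
picks = list(combinations(positions, 2))
# A pick is correct only if it selects exactly the two "A" samples.
correct = [p for p in picks if all(samples[i] == "A" for i in p)]

chance_two_of_five = len(correct) / len(picks)
print(len(picks), chance_two_of_five)
```

Because only one of the possible picks is correct, a guessing panelist
succeeds far less often than in the triangle test, which is where the
statistical efficiency comes from.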
This method is used when the objective is to determine whether a
sensory difference exists between two samples, particularly when the number
of subjects available is small (for example, ten persons).
The Two-out-of-Five Test is effective in the same situations as the
triangle test: determining whether a product difference results from a
change in ingredients, processing, packaging, or storage; determining
whether an overall difference exists where no specific attribute can be
identified as having been affected; and selecting and monitoring panelists
for their ability to discriminate given differences in test situations.
In this method, panelists are presented with five samples. Two of the
five samples belong to one type and the other three belong to another type.
The samples are tasted, viewed, examined, or felt in order from left to
right. Panelists identify the two samples that differ from the other three.
Trained panelists are needed for this test. Generally, 10 to 20 panelists
are used; when the differences are large and easy to identify, 5 or 6
panelists are enough.
2.3 Same/Different Test (or Simple Difference Test)
The same/different test is also known as the simple difference test. It
is used when the objective is to determine whether a sensory difference
exists between two products, particularly when the samples are not suitable
for triple or multiple presentation, that is, for the triangle or duo-trio
tests. Examples include comparisons between samples with a strong or
lingering flavour, samples that need to be applied to the skin in half-face
tests, and samples that are so complex that comparing them is mentally
confusing to the panelist.
Like the other tests, the same/different test is effective in certain
situations: determining whether a product difference results from a change
in ingredients, processing, packaging, or storage, and determining whether
an overall difference exists where no specific attribute can be identified
as having been affected. This test is more time-consuming than the others,
because the information on product differences is obtained by comparing the
responses to different pairs (A/B and B/A) with the responses to matched
pairs (A/A and B/B).
In this method each panelist is presented with two samples and asked
whether they are the same or different. Half of the pairs consist of two
different samples; the other half present the same sample twice. Generally,
20 to 50 presentations of each of the four sample combinations (A/A, B/B,
A/B, and B/A) are required to determine differences. Up to 200 panelists may
be needed if each receives one pair, or 100 panelists if each receives two
of the pairs. Where the same/different test has been chosen because of the
complexity of the stimuli, subjects should not be presented with more than
one pair of samples at a time. Panelists may be trained or untrained, but a
panel should not mix trained and untrained subjects. The results are
analyzed by comparing the responses for the different pairs using the
χ² test.
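As an illustration of the analysis, the sketch below computes a Pearson
chi-square statistic for a 2x2 table of pair type (matched or unmatched)
against response ("same" or "different"). The counts are hypothetical, no
continuity correction is applied, and 3.84 is the standard 5% critical value
for one degree of freedom; none of these choices comes from the text.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction)
    for a table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows = pair presented, columns = response given.
#          "same"  "different"
counts = [[17, 13],   # matched pairs (A/A or B/B)
          [9, 21]]    # unmatched pairs (A/B or B/A)
CRITICAL_05_DF1 = 3.84  # upper 5% point of chi-square, 1 degree of freedom
stat = chi_square_2x2(counts)
print(f"chi-square = {stat:.2f}, significant = {stat > CRITICAL_05_DF1}")
```

If panelists answer "different" clearly more often for unmatched pairs than
for matched pairs, the statistic exceeds the critical value and a difference
between the products is indicated.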
2.4 “A” - “Not A” Test
Like the same/different test, this test is used when the objective is
to determine whether a sensory difference exists between two products,
particularly when the samples are not suitable for dual or triple
presentation (so the triangle and duo-trio tests cannot be used). Examples
include comparisons between samples with a strong or lingering flavour,
samples that need to be applied to the skin in half-face tests, and samples
that are so complex that comparing them is mentally confusing to the
panelist.
The “A” - “not A” test is used in preference to the same/different test
when one of the two products has importance as a standard or reference
product, is familiar to the subjects, or is essential to the project as the
current sample against which all others are measured. The test is effective
in exactly the same situations as the same/different test. It is also very
useful for screening panelists, and it can be used for determining sensory
thresholds by a Signal Detection method. The principle of the test is first
to familiarize the panelists with samples “A” and “not A”. Each panelist is
then presented with samples, some of which are product “A” while others are
“not A”, and identifies each sample as “A” or “not A”. The χ² test is used
to compare the correct identifications with the incorrect ones in order to
determine the subjects' ability to discriminate.
Generally, 10 to 50 panelists are trained to recognize the “A” and
“not A” samples, and 20 to 50 presentations of each sample are used in the
study. Each panelist may receive only one sample (“A” or “not A”), two
samples (one “A” and one “not A”), or may test more than 10 samples in a
series. The number of samples that can be presented to a subject is
determined by the degree of physical or mental fatigue the samples produce.
In the standard version of the procedure, the following protocol is
observed: products “A” and “not A” are available to panelists only until the
start of the test; only one “not A” sample exists for each test; and equal
numbers of “A” and “not A” are presented in each test. This protocol may be
changed for any given test, but the panelists must be informed of the
changes.
2.5 Duo-trio test
Application and importance:
The duo-trio test (ISO 2004a) is statistically less efficient than the
triangle test, because the chance of obtaining a correct result by guessing
is 1 in 2. However, the test is simple and easily understood. Its advantage
is that a reference sample is presented, which avoids confusion about what
constitutes a difference. Its disadvantage is that three samples, rather
than two, must be tasted. The method is used when the test objective is to
determine whether a sensory difference exists between two samples. It is
particularly useful for determining whether product differences result from
a change in ingredients, processing, packaging, or storage. It is also used
to determine whether an overall difference exists where no specific
attribute can be identified as having been affected.
The duo-trio test has general application whenever more than 15, and
preferably more than 30, test subjects are available. The test exists in two
forms: the constant reference mode and the balanced reference mode. In the
constant reference mode the same sample, usually drawn from regular
production, is always the reference; in the balanced reference mode both of
the samples being compared are used at random as the reference. Use the
constant reference mode with trained subjects whenever a product well known
to them can be used as the reference. Use the balanced reference mode when
both samples are unknown or when untrained subjects are used. The duo-trio
test is less suitable than the paired comparison test if there are
pronounced aftertastes.
Principle of the Test:
An identified reference sample is presented to the subject, followed by
two coded samples, one of which matches the reference sample. The subject
indicates which coded sample matches the reference. The number of correct
replies is counted and interpreted by referring to the table of Critical
Number of Correct Responses.
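Entries of such a table can be reproduced approximately from the binomial
distribution. The sketch below is an illustration under the assumption of a
one-sided test in which a guessing subject is correct with probability 1 in
2; the function name and the chosen alpha level of 0.05 are hypothetical,
and the published table remains the reference.

```python
from math import comb

def critical_correct(n: int, p_guess: float, alpha: float) -> int:
    """Smallest count c such that the chance of c or more correct
    answers by guessing alone is at most alpha. Returns n + 1 if
    even n correct answers would not be significant."""
    tail = 0.0
    # Walk down from n, accumulating the upper-tail probability.
    for c in range(n, -1, -1):
        tail += comb(n, c) * p_guess**c * (1 - p_guess) ** (n - c)
        if tail > alpha:
            return c + 1
    return 0

for subjects in (16, 32, 40):
    print(subjects, critical_correct(subjects, 0.5, 0.05))
```

The same function could be used for other forced-choice difference tests by
changing `p_guess`, for example to 1/3 for the triangle test.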
Test Subjects:
The minimum number of subjects for this test is 16, but with fewer than
28 subjects the beta-error is high. Discrimination is improved if 32, 40, or
more subjects are used. At a minimum, subjects need to be familiarized with
the product characteristics and the test procedure. To avoid bias, subjects
are not given specific information about the samples.
Method:
Control of lighting may be necessary to reduce colour variables, and
samples need to be prepared and presented under optimum conditions for the
product being inspected. Offer the samples simultaneously if possible, or
else sequentially. Prepare equal numbers of the possible combinations and
allocate the sets at random among the subjects. The score sheet is the same
in the balanced reference and constant reference modes, and space for
several duo-trio tests may be provided on it. Count the number of correct
responses and the total number of responses, and refer to the table of
Critical Number of Correct Responses. Do not count “no difference”
responses; subjects must guess when in doubt.
Usage:
As an example, this test can be used where a food manufacturer (e.g.,
of a chocolate blend) needs to replace the cocoa beans currently used in its
product. The food analyst tries to determine which type of cocoa bean can
best replace the current blend by testing for similarity between the current
blend and each candidate blend, to see whether there is any significant
difference between the original blend and the substitute.
2.6 Difference-from-control test:
Application and importance:
This test is used when the test objective is twofold: to determine
whether a difference exists between one or more samples and a control, and
to estimate the size of any such difference. One sample is designated as the
“control”, “reference”, or “standard”, and all other samples are evaluated
with respect to how different each is from the control. The test is useful
in situations where a difference may be detectable but the size of the
difference affects the decision about the test objective. It is appropriate
when the duo-trio and triangle tests cannot be used because of the normal
heterogeneity of food products. It can also be used as a two-sample test in
situations where multiple-sample tests are inappropriate because of fatigue
and carryover effects.
Principle of the Test:
Each subject is presented with a control sample plus one or more test
samples. The subject rates the size of the difference between each sample
and the control on a scale provided for this purpose. Inform the subjects
that some of the test samples may be the same as the control. The resulting
mean difference-from-control estimates are evaluated by comparing them with
the difference-from-control obtained with the blind controls, which provides
a measure of the placebo effect.
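The comparison with the blind control can be sketched as follows. The
ratings, the 0-10 scale, and the sample names are hypothetical; the sketch
only shows the arithmetic of subtracting the blind-control mean and does not
replace a significance test on the ratings.

```python
from statistics import mean

# Hypothetical ratings on a 0-10 difference-from-control scale,
# one list entry per subject.
ratings = {
    "blind control": [1, 2, 0, 1, 2, 1, 0, 2],
    "test sample": [4, 5, 3, 4, 6, 3, 4, 5],
}

means = {name: mean(values) for name, values in ratings.items()}
placebo_effect = means["blind control"]
# Difference attributable to the product, net of the placebo effect.
adjusted = {
    name: m - placebo_effect
    for name, m in means.items()
    if name != "blind control"
}
print(means, adjusted)
```

A blind-control mean above zero shows how much difference subjects report
even when none exists, which is why test samples are judged against it
rather than against zero.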
Test subjects:
Generally, 20 to 50 presentations of each of the samples and of the
blind control, each with the labelled control, are required to determine a
degree of difference. When the difference-from-control test is chosen
because of a complex comparison or a fatigue factor, no more than one pair
of samples should be given to a subject at a time. The panel may consist of
trained or untrained panellists, but not a mixture of both. The subjects
need to be familiarized with the test format, the meaning of the scale, and
the fact that a proportion of the test samples will be blind controls.
Method:
The test controls and product controls for this test are the same as
for the triangle test and the duo-trio test. Present the samples
simultaneously if possible, with the labelled control evaluated first.
Prepare one labelled control sample for each subject; the other coded
samples are the test samples. When every sample is to be evaluated by all
subjects but this cannot be done in one test session, keep a record of
subjects by sample to ensure that the remaining samples are presented in
subsequent sessions.
Usage:
As an example, the test can be used to measure the perceived difference
between batches of a food such as a flavoured peanut snack. A test method is
developed for monitoring batch-to-batch variation in the production of the
flavoured peanut snacks (e.g., spicy flavour and barbeque flavour). In a
difference test such as this, subjects need to detect batch-to-batch
differences, which allows the variation in the flavoured peanut snacks to be
separated out.
2.7 Sequential test
Application and importance:
Sequential tests are meant to economize on the number of evaluations
required to draw a conclusion, for example acceptance vs. rejection of a
trainee on a panel, or shipment vs. destruction of a lot of produced goods.
Because the alpha and beta errors are decided beforehand, sequential tests
provide a direct approach to testing simultaneously for either a difference
or a similarity between two samples.
Sequential tests are practical and efficient because they take into
consideration the possibility that the evidence from the first few
evaluations is sufficient to reach a conclusion, in which case further
testing would be a waste of time and money. They can reduce the number of
evaluations by as much as 50%. They may be used with any
existence-of-difference test in which there is a correct and an incorrect
answer.
Principle:
A sequence of evaluations is conducted according to the procedure
appropriate for the chosen method, and the results are entered into a test
graph. Three regions are identified on the graph: the acceptance region, the
rejection region, and the continue-testing region. The number of trials is
plotted on the horizontal (x) axis and the total number of correct responses
on the vertical (y) axis. Enter the result of the first test; for each
succeeding test, increase x by 1, and increase y by 1 for a correct reply or
by 0 for an incorrect reply. Continue testing until a point touches or
crosses one of the lines bordering the region of indecision. The region the
point enters indicates the conclusion to be drawn.
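The graphical procedure can also be followed numerically. The sketch below
uses Wald's sequential probability ratio test, which is one common way of
deriving the two boundary lines; it is offered as an assumption about how
such a graph could be built, not as the procedure of any particular
standard. The parameters `p0` (chance of a correct reply when no difference
is perceived), `p1` (chance of a correct reply under a difference worth
detecting), and the alpha and beta values are all hypothetical.

```python
from math import log

def sequential_decision(responses, p0, p1, alpha, beta):
    """Wald sequential test on a stream of correct (True) or
    incorrect (False) responses.
    Returns (decision, number_of_trials_used)."""
    upper = log((1 - beta) / alpha)  # reaching it: declare a difference
    lower = log(beta / (1 - alpha))  # reaching it: declare no difference
    llr = 0.0  # running log-likelihood ratio of p1 versus p0
    n = 0
    for n, correct in enumerate(responses, start=1):
        llr += log(p1 / p0) if correct else log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "difference", n
        if llr <= lower:
            return "no difference", n
    return "continue testing", n

# Hypothetical duo-trio series: guessing gives 0.5, a real difference 0.7.
print(sequential_decision([True, True, False, True], 0.5, 0.7, 0.05, 0.10))
```

Each response moves the running total up or down by a fixed step, which is
equivalent to moving a point on the graph until it leaves the
continue-testing region.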
Usage:
A sequential test can reach a conclusion by plotting the results in a
graph. An example is the sequential duo-trio test of warmed-over flavour in
beef patties, where the aim is to determine whether a difference can be
detected between patties stored for one day, three days, and five days and
freshly grilled patties. Preliminary duo-trio tests show that 5-day patties
have a strong warmed-over flavour and 1-day patties have none, so a
sequential test design is appropriate: the decision for these two samples
could occur after just a few responses. As each subject completes one test,
the result is added to the previous responses and the cumulative results are
plotted. The test series continues until the storage sample is declared
similar to or different from the control.
3.0 ATTRIBUTE DIFFERENCE TESTS
Attribute difference tests measure a single attribute such as sweetness,
comparing one sample with one or several others. The lack of a difference
between samples with regard to one attribute does not imply that no overall
difference exists. Attribute difference tests involving two samples are
simple with regard to test design and statistical treatment; the main
difficulty is determining whether the test situation is one-sided or
two-sided. With more than two samples, some designs can be analyzed by the
analysis of variance whereas others require specialized statistics. The
degree of complexity increases rapidly with the number of samples, as does
the economy of testing that improved test designs make possible.
This section describes the various paired tests, followed by the
multisample tests and their designs.
3.1 Directional Difference Test: Comparing Two Samples
DEFINITION
The method is also called the paired comparison test or the 2-AFC
(2-alternative forced choice) test. It is one of the simplest and most used
sensory tests, and it is often used first to determine whether other, more
sophisticated tests should be applied.
PURPOSE/USAGE
Use this method when the test objective is to determine in which way a
particular sensory characteristic differs between two samples.
APPLICATION, TOOLS AND TECHNIQUE INVOLVED
The number of respondents required for the test is affected by:
1) Whether the test is one-sided or two-sided
2) The values chosen for the test-sensitivity parameters
Present each subject with two coded samples. Prepare equal numbers of
the combinations AB and BA and allot them at random among the subjects. Ask
the subject to taste the products from left to right and fill in the
scoresheet. Clearly inform the subject whether “no difference” verdicts are
permitted.
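The balanced allocation of AB and BA can be sketched as below. The function
name, the use of a seed for reproducibility, and the handling of an
odd-sized panel are assumptions made for illustration.

```python
import random

def allocate_pairs(subject_ids, seed=None):
    """Assign serving orders AB and BA in equal numbers (or as near
    equal as an odd panel size allows), shuffled among the subjects."""
    rng = random.Random(seed)
    subject_ids = list(subject_ids)
    n = len(subject_ids)
    orders = ["AB", "BA"] * (n // 2)
    if n % 2:  # odd panel: the extra order is chosen at random
        orders.append(rng.choice(["AB", "BA"]))
    rng.shuffle(orders)
    return dict(zip(subject_ids, orders))

# Hypothetical panel of 20 subjects numbered 1 to 20.
plan = allocate_pairs(range(1, 21), seed=1)
for subject, order in plan.items():
    print(subject, order)
```

Balancing the two serving orders keeps position effects, such as a tendency
to favour the first sample tasted, from being confused with a real
difference.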
Only the “forced choice” technique is amenable to formal statistical
analysis. However, in some cases subjects may object quite strenuously to
inventing a difference when none is perceived. The sensory analyst must
then decide whether to divide their scores evenly over the two samples or
ignore them.
The scoresheet is the same whether the test is one-sided or two-sided,
but it must show whether “no difference” verdicts are permitted. Space for
several successive paired comparisons may be provided on a single
scoresheet, but do not add supplemental questions, because these may
introduce bias.
To analyze the results, count the number of responses of interest. In a
one-sided test, count the number of correct responses, or the responses in
the direction of interest. In a two-sided test, count the number of agreeing
responses citing one sample more frequently.
IMPLICATION AND IMPORTANCE
The test can be conducted with subjects who have received a minimum of
training; it is sufficient that subjects are completely familiar with the
attribute under test. If a test is of particular importance, such as an
off-flavor in a product already on the market, highly trained subjects may
be selected who have shown special acuity for the attribute. Because the
chance of guessing is 50%, fairly large numbers of test subjects are
required.
3.2 Pairwise Ranking Test: Friedman Analysis
Comparing Several Samples in All Possible Pairs
PURPOSE/USAGE
This method is used when the test objective is to compare several
samples for a single attribute, such as sweetness, freshness, or preference.
The test is particularly useful for sets of three to six samples that are to
be evaluated by a relatively inexperienced panel. It arranges the samples on
a scale of intensity of the chosen attribute and provides a numerical
indication of the differences between samples and the significance of such
differences.
APPLICATION, TOOLS AND TECHNIQUE INVOLVED
The principle of the test is to present each subject with one pair at a
time, in random order, together with a question such as “Which sample is
sweeter?” (or fresher, or preferred). This continues until each subject has
evaluated all possible pairs that can be formed from the samples. The
results are evaluated by a Friedman-type statistical analysis.
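The bookkeeping behind this procedure can be sketched as follows: one helper
lists every possible pair in random order, and another accumulates rank sums
from the judgments. Scoring the weaker sample of each pair as rank 1 and the
stronger as rank 2 is an assumption made for illustration, as are the
function names; the significance test applied to the rank sums is not shown.

```python
import random
from itertools import combinations

def all_pairs_in_random_order(samples, rng):
    """Every possible pair of samples, with both the order of the
    pairs and the order within each pair shuffled."""
    pairs = [list(p) for p in combinations(samples, 2)]
    for pair in pairs:
        rng.shuffle(pair)
    rng.shuffle(pairs)
    return [tuple(p) for p in pairs]

def rank_sums(samples, judgments):
    """judgments: (less_intense, more_intense) tuples pooled over all
    subjects. The weaker sample scores 1, the stronger scores 2."""
    totals = {s: 0 for s in samples}
    for less, more in judgments:
        totals[less] += 1
        totals[more] += 2
    return totals

samples = ["A", "B", "C", "D"]
serving_plan = all_pairs_in_random_order(samples, random.Random(7))
print(serving_plan)
# Hypothetical judgments from one subject on three of the samples.
print(rank_sums(["A", "B", "C"], [("A", "B"), ("A", "C"), ("B", "C")]))
```

With t samples there are t(t - 1)/2 pairs per subject, which is why the
method is recommended only for small sets of samples.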
Select, train, and instruct the subjects as described for the other
tests. Use no fewer than 10 subjects; discrimination is much improved if 20
or more can be used. Ascertain that subjects can recognize the attribute of
interest, for example by training with various pairs of known intensity
difference in the attribute. Depending on the test objective, subjects may
be required who have proven ability to detect small differences in the
attribute.
The test controls and product controls are the same as stated before.
3.3 Multisample Difference Tests
There are several types of multisample difference tests:
1. Multisample Difference Test: Rating Approach-Evaluation by Analysis of
Variance
2. Multisample Difference Test: BIB Ranking test ( Balanced Incomplete
Block Design)-Friedman Analysis
3. Multisample Difference Test: BIB Rating Test ( Balanced Incomplete
Block Design)-Evaluation by Analysis of Variance
3.3.1 Multisample Difference Test: Rating Approach-Evaluation by Analysis of
Variance
The rating approach is used when the test objective is to determine in
which way a particular sensory attribute varies over a number of t samples,
where t may vary from 3 to 6, or at most 8, and it is possible to compare
all t samples as one large set. Subjects rate the intensity of the selected
attribute on a numerical intensity scale, for example a category scale. The
results are evaluated by the analysis of variance.
Each subject receives the set of t samples in balanced, randomized
order, and the task is to rate each sample using the specified scale. The
set may be presented once only, or several times with different coding.
Accuracy is much improved if the set can be presented two or more times. If
more than one attribute is to be rated, in theory the samples should be
presented separately for each attribute.
An example is the hop character in five beers. A brewer is producing a
new brand of beer that is to have a high level of hop character. He is
brewing with five alternative lots of hops that cost $1.00, $1.20, $1.40,
$1.60, and $1.80/lb. The project objective is to choose the lot that gives
the most hop character for the money; the test objective is to compare the
resulting five beers for degree of hop character and to obtain a measure of
the reliability of the results. Twenty subjects evaluate the samples on a
scale of 0-9. The order of presentation is randomized, and the samples are
presented on three separate occasions with different coding.
3.3.2 Multisample Difference Test: BIB Ranking test ( Balanced Incomplete
Block Design)-Friedman Analysis
The BIB ranking test is used when the test objective is to determine in
which way a particular sensory attribute varies over a number of samples
and there are too many samples to evaluate at any one time. Typically, the
method is used when the number of samples to be compared is from 6 to 12,
or at most 16. The present method (ranking) is chosen when the panelists
are relatively untrained for the type of sample or when relatively simple
statistical analysis is preferred. Subjects are asked to rank the samples
according to the attribute of interest.
For example, consider the species of fish. A military field ration, XPQ-6
(fish fingers in aspic), has been prepared in the past from 15 different
species of fish. The project objective is to compare the 15 species such
that quantitative information on the degree of fishy flavor is obtained,
while the test objective is to compare fish fingers produced from the 15
species for degree of fishy flavor. A randomly selected group of 105 enlisted
personnel is randomly divided into 35 groups of three subjects each. A
scoresheet is prepared that asks each subject to rank his three samples
according to fishy flavor, from least (=1) to most (=3).
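The design in this example can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the design parameters (t = 15 samples, k = 3 samples per block, b = 35 blocks) are read off the example, and p = 3 repetitions is an assumption based on each of the 35 groups of three subjects sharing one block. The statistic is the usual Friedman-type (Durbin) rank statistic for balanced incomplete blocks, which is compared with a chi-square distribution on t - 1 degrees of freedom.

```python
from fractions import Fraction

def bib_parameters(t, k, b):
    """Return (r, lam) for a BIB design with t samples, block size k and b blocks.

    r   = number of blocks in which each sample appears = b*k/t
    lam = number of blocks in which each pair of samples meets = r*(k-1)/(t-1)
    """
    r = Fraction(b * k, t)
    lam = r * (k - 1) / (t - 1)
    if r.denominator != 1 or lam.denominator != 1:
        raise ValueError("no balanced incomplete block design with these parameters")
    return int(r), int(lam)

def friedman_bib_statistic(rank_sums, t, k, r, lam, p=1):
    """Friedman-type statistic for a BIB ranking test repeated p times.

    T = 12 / (p*lam*t*(k+1)) * sum(R_j^2) - 3*(k+1)*p*r^2 / lam
    where R_j is the rank sum of sample j over all subjects.
    """
    if len(rank_sums) != t:
        raise ValueError("one rank sum per sample is required")
    sum_sq = sum(R * R for R in rank_sums)
    return 12 * sum_sq / (p * lam * t * (k + 1)) - 3 * (k + 1) * p * r * r / lam

r, lam = bib_parameters(t=15, k=3, b=35)
```

For the fish-finger design this gives r = 7 and lam = 1, that is, every species appears in seven blocks and every pair of species meets in exactly one block. If all 15 species received the same rank sum, the statistic would be zero.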
3.3.3 Multisample Difference Test: BIB Rating Test (Balanced Incomplete
Block Design)-Evaluation by Analysis of Variance
Usage/Application
This method is used when the test objective is to determine in which
way a particular sensory attribute varies over a number of samples.
Typically, the number of samples to be compared is from 6 to 12, or at most
16. The present method (rating) is chosen when panelists are trained to use
a rating scale and results need to be as precise and actionable as possible.
Instead of presenting all t samples as one large block, the samples are
presented in several smaller blocks, each containing a subset of the
samples, and the subjects are asked to rate the intensity of the attribute
of interest on a numerical intensity scale. The results are evaluated by
analysis of variance.
The subjects must be able to recognize the attribute of interest, for
example through training with sets of known intensity levels in the
attribute. Not fewer than 8 subjects are used; discrimination is much
improved if 16 or more are used. Subjects may require special instruction to
enable them to recognize the attributes of interest reproducibly. Depending
on the test objectives, subjects may be selected who show high
discriminating ability in the attribute(s) of interest.
In the BIB rating test, samples are offered simultaneously if possible,
or else sequentially. The order of presentation must be truly random:
subjects must not be led to suspect a regular pattern, as this will
influence verdicts.
For example, the QC manager of an ice cream plant routinely screens
samples of finished product to select lots that will be added to the pool of
quality reference samples for use in the main QC testing program. The
project objective is to maintain a sufficient inventory of reference samples
of finished ice cream for QC testing purposes, while the test objective is
to rate the inventory of six lots each day for overall off-flavor and
discard any lot that may not be suitable as a reference. The samples of the
six lots are evaluated for overall off-flavor by 15 well-trained panelists
who use a 10-point category scale from 0 (no off-flavor) to 9 (extreme
off-flavor). Each of the 15 panelists is randomly assigned one block of four
samples from the design. The order of presentation of the samples within
each block is randomized.
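The text does not say which design supplies the 15 blocks. One design that fits the numbers, assumed here purely for illustration, is the set of all C(6,4) = 15 four-lot subsets of the six lots; in it every lot appears in 10 blocks and every pair of lots shares 6 blocks. The sketch below builds that design, assigns the blocks to panelists at random and randomizes the serving order within each block. The function name and the lot labels A-F are hypothetical.

```python
import itertools
import random

def all_subsets_bib(lots, block_size, seed=None):
    """Return one randomized block per panelist using every block_size-subset of lots."""
    rng = random.Random(seed)
    blocks = [list(c) for c in itertools.combinations(lots, block_size)]
    rng.shuffle(blocks)      # random assignment of blocks to panelists 1..b
    for block in blocks:
        rng.shuffle(block)   # random serving order within each block
    return blocks

lots = ["A", "B", "C", "D", "E", "F"]   # hypothetical labels for the six lots
design = all_subsets_bib(lots, block_size=4, seed=0)

# Balance checks: how often each lot and each pair of lots occurs in the design.
lot_counts = {lot: sum(lot in blk for blk in design) for lot in lots}
pair_counts = {
    pair: sum(pair[0] in blk and pair[1] in blk for blk in design)
    for pair in itertools.combinations(lots, 2)
}
```

Because every lot is rated equally often and every pair of lots is compared within the same number of blocks, lot differences can be estimated with equal precision in the analysis of variance.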
4.0 DESCRIPTIVE ANALYSIS TECHNIQUE
Descriptive analysis is applied in documenting product sensory
characteristics, identifying and quantifying sensory characteristics,
correlating instrumental and chemical measurements with sensory
responses, monitoring product quality, interpreting consumer responses,
diagnosing the sensory effects of ingredient, processing or packaging
changes, predicting consumer acceptance, and matching sensory profiles in
quality assessments. In addition, sensory profiles are used in research
and development and in manufacturing to define the sensory properties of a
target for new product development; to document product attributes before a
consumer test, which helps in selecting the attributes to be included in the
consumer questionnaire and in explaining the results of the consumer test;
to track a product’s sensory changes over time with respect to shelf life,
packaging and similar factors; to map perceived product attributes for the
purpose of relating them to instrumental, chemical or physical properties;
and to measure short-term changes in the intensity of specific attributes
over time (time-intensity analysis).
The principles of descriptive analysis are that it deals with
perceptions, not with ingredients, causes or implications; it does not ask
questions about consumer acceptability; it uses panels consisting of trained
or calibrated observers; it uses well-defined terminology; its data are
quantified through ratings of perceived intensities on scales; and it seeks
to answer questions about how products differ on specific sensory bases.
There are four components in descriptive analysis: first, characteristics
(the qualitative aspect); second, intensity (the quantitative aspect,
measured with category scales, line scales, or magnitude estimation); third,
order of appearance (the time aspect); and lastly, overall impression (the
integrated aspect).
6 Commonly Used Descriptive Test Methods.
1) The Flavor Profile Method.
It is an analysis of a product's perceived aroma and flavor
characteristics, their intensities, order of appearance, and aftertaste. An
amplitude rating is generally included as part of the profile. It provides a
general tool for characterizing the flavors of complex food products.
Moreover, the method has proved valuable for examining flavor differences
among foods that result from changes in ingredients, processing and
storage. It is normally carried out by 5-8 panelists.
2) The Texture Profile Method.
The texture profile method was developed in order to define the
textural parameters of food. Later the method was expanded to include
attribute descriptors specific to particular products, including semisolid
foods, beverages, skin-feel products, fabrics and paper goods. Texture is a
sensory attribute perceived by the senses of touch, sight and hearing. A
texture profile is the sensory analysis of the texture complex of a food in
terms of its mechanical, geometrical, fat and moisture characteristics, the
degree of each present, and the order in which they appear from first bite
through complete mastication.
3) The Spectrum Descriptive Analysis Method.
The principal characteristic of the Spectrum Descriptive Analysis
method is that the panelists score the perceived intensities with reference
to pre-learned “absolute” intensity scales. The purpose is to make the
resulting profiles universally understandable and usable, not only at a
later date but also at any laboratory outside the originating one. For this
purpose the method provides an array of standard attribute names, each with
its own set of standards that define a scale of intensity, usually from 0 to
15. The philosophy of Spectrum is pragmatic: it provides the tools to design
a descriptive procedure for a given product category. The principal tools
are the reference lists contained in Spectrum’s appendices, together with
the scaling procedures and the methods of panel training. The main aim is to
choose the most practical system, given the product in question, the overall
sensory program, the specific project objectives in developing a panel, and
the desired level of statistical treatment of the data.
4) Time-Intensity Descriptive Analysis.
As food enters the oral cavity, travels over the tongue and is ingested,
flavor, texture, and even sound perception change due to the breakdown of
the food. Conventional scaling procedures, used to evaluate e.g. flavor
intensity, require judges to average their sensory response over time. This
yields only an overall impression, with no information about the course of
the sensation. The time-intensity (T-I) technique overcomes this by
focusing on the dynamic changes in perception over the entire process.
Changes in the perception of taste, flavor, texture, irritation and odor
over a selected period of time can be measured. The duration of the
perception varies among products. Time-intensity studies can be divided into
three kinds: long-term, shorter-term, and shortest-term studies. Long-term
time-intensity studies can be applied to skin lotions, to measure the
reduction of skin dryness periodically over days. Shorter-term studies track
flavor and texture attributes of chewing gum over several minutes. The
shortest-term studies can be applied to the measurement of sweetness and
bitterness of certain products over several seconds.
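A time-intensity record is simply a series of (time, intensity) readings, and it is commonly reduced to a few summary values. The sketch below is an illustration only: the choice of summaries (peak intensity, time to peak, and area under the curve by the trapezoid rule) and the function name ti_summary are assumptions made here, not something the method description above prescribes.

```python
def ti_summary(times, intensities):
    """Summarize one time-intensity curve.

    Returns (peak intensity, time at which the peak first occurs,
    area under the curve computed with the trapezoid rule).
    """
    times = list(times)
    intensities = list(intensities)
    if len(times) != len(intensities) or len(times) < 2:
        raise ValueError("need at least two matching time/intensity readings")
    peak = max(intensities)
    time_to_peak = times[intensities.index(peak)]
    points = list(zip(times, intensities))
    area = sum(
        (t1 - t0) * (y0 + y1) / 2
        for (t0, y0), (t1, y1) in zip(points, points[1:])
    )
    return peak, time_to_peak, area
```

The same function works whether the time axis is in seconds (sweetness), minutes (chewing gum) or days (skin lotion); only the unit of the readings changes.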
5) Free-Choice Profiling.
Free-choice profiling (FCP), developed in the 1980s, is a sensory
analysis method that can be carried out by untrained panels. The
participants need only be able to use a scale and be consumers of the
product under evaluation. Free-choice profiling was developed by Williams
and Arnold at the Agricultural and Food Council in the United Kingdom as a
solution to the problem of consumers using different terms for a given
attribute. It allows the panelists to invent and use as many terms as they
need to describe the sensory characteristics of a set of samples. The
samples all come from the same category of products, and each panelist
develops his or her own score sheet. The main advantage of the technique is
that it saves much time by not requiring any training of the panelists other
than an hour of instruction in the use of the chosen scale. The second
advantage is that the panelists, not having been trained, can still be
regarded as representing naïve consumers. However, questions remain
regarding the ability of the sensory analyst to “interpret” the resulting
terms combined from all the panelists. In order to give reliable guidance to
product researchers, the experimenter or sensory analyst must decide what
each of the combined terms actually means. Therefore, the words or terms for
each resulting parameter come from the experimenter or sensory analyst
rather than from the panelists, and the results may be colored more by the
perspective of the analyst than by the combined weight of the panelists’
verdicts.
6) The Quantitative Descriptive Analysis Method (QDA).
The Quantitative Descriptive Analysis (QDA) method was developed by
the Tragon Corp. because the other methods lacked statistical treatment of
data. This method relies on statistical analysis to determine the
appropriate terms, procedures and panelists to be used for analysis of a
specific product. This is intended to reduce unnecessary bias, such as the
panel being dominated by the panel leader in discussion and scaling. The
panelists can be selected from a large pool of candidates, as long as they
have passed standardized tests for olfactory, taste and color sensitivity as
well as for memory, verbal abilities and creativity. This method also has a
panel leader. However, unlike in the flavor profile method, the panel leader
acts as a facilitator rather than an instructor and refrains from
influencing the group. The panelists evaluate the samples and record their
own results in separate booths under defined conditions such as temperature
and light. This reduces distraction and interaction among the panelists, and
there is no discussion among the panelists while they evaluate. The score
sheets are collected once the panelists finish evaluating, and the data are
entered into a computer for statistical analysis. One such computer program
is CASA (Computer Aided Sensory Analysis). The results are analyzed
statistically and presented graphically, normally in the form of a spider
web with a spoke radiating from a central point for each attribute.
Spider-web plots are used to present data graphically.
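The geometry of a spider-web plot is simple: one spoke per attribute, equally spaced around a central point, with the mean score marked along each spoke. The following sketch computes the plot coordinates only and leaves the drawing to any plotting tool. The function name and the attribute names in the usage note are hypothetical, and the layout (first spoke pointing right, spokes running counter-clockwise) is an arbitrary choice.

```python
import math

def spider_vertices(mean_scores):
    """Map [(attribute, mean score), ...] to {attribute: (x, y)} on a spider-web plot.

    Spoke i points in direction 2*pi*i/n; the score is the distance from the center.
    """
    n = len(mean_scores)
    if n < 3:
        raise ValueError("a spider-web plot needs at least three attributes")
    vertices = {}
    for i, (attribute, score) in enumerate(mean_scores):
        angle = 2 * math.pi * i / n
        vertices[attribute] = (score * math.cos(angle), score * math.sin(angle))
    return vertices
```

Joining the vertices in order gives the polygon for one product; calling spider_vertices once per product, for example with [("sweet", 1.0), ("sour", 2.0), ("bitter", 3.0), ("salty", 4.0)], and overlaying the polygons lets products be compared attribute by attribute.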
Panelists work independently of one another. Booths can be used to
minimize social influences. Discussion can follow for calibration purposes.
5.0 AFFECTIVE TESTS
5.1 Usage/Application
These are tests in which subjective attitudes, such as product
acceptance and preference, are measured. In affective tests the task is to
indicate preference or acceptance by either selecting, ranking, or scoring
samples.
Respondents are usually consumers who are selected on their current
or potential use of the product. In laboratory situations, respondents
matched on consumer demographics are often replaced by more accessible
respondents (e.g., employees) whose preference and acceptance behavior
correlates satisfactorily with that of the target consumer population.
Laboratory-type acceptance tests can be done with 25 to 50 respondents. In
field studies where the target population is used, minimum numbers increase
to 75 to 200 or more. As a rule, technical, marketing, and administrative
personnel involved with the particular product should not be used in
affective tests because of their prior knowledge and potential for biased
response.
The primary purpose of affective tests is to assess the personal
response (preference or acceptance) of current or potential customers to a
product, a product idea, or specific product characteristics.
Affective tests may be used for a variety of purposes including:
Product Maintenance
Product Improvement/Optimization
New Product Development
Assessment of Market Potential
Support for Advertising Claims
Affective tests are used mainly by producers of consumer goods, but also
by service providers such as hospitals, banks, and the Armed Forces, where
many tests were first developed. Every year, the use of consumer tests
becomes more common. They have proven highly effective as a tool used to
design products and services that will sell in large quantities or command a
higher price. Prosperous companies tend to excel in customer-testing
knowledge and, consequently, in knowledge about their consumers. Affective
tests can be qualitative or quantitative, depending on purpose. Whichever
type of test is used, care needs to be taken to ensure the sample of testers is
representative of the target population expected to buy the product.
5.2 Affective Test Methods—Fuzzy Front End
One of the affective test methods is the Fuzzy Front End. Uncovering
consumers’ needs often occurs at the beginning, at the fuzzy front end.
Typically, the research is conducted at the very early stage of a project,
when planning is being carried out, initial market and technical feasibility is
being assessed, and breakthrough ideas are being explored. Research at the
fuzzy front end is conducted before dollars are committed to detailed
technical assessment, costly concept testing is executed and significant
manpower and out-of-pocket expenses are committed. This does not imply
that the tools and techniques applied to understand the consumer early
cannot be applied at all stages of the product development process.
Methods used are unique because they gather in-depth information on who
the consumer really is, how and why products are used, what they really like,
dislike, and need. To capture this level of information, one must move
beyond the standard, frequently used quantitative and qualitative
approaches.
Research at the fuzzy front end allows the:
Exploration of consumers as purchasers of products with specific
features or sensory properties identified.
Study of product functionality and ergonomics.
Determination of how a consumer is modifying a product or adapting
usage to suit his/her needs.
Uncovering of attitudes, behaviors, and motivators within the culture.
Study of the consumers in their own environment through
observational research.
Beyond the traditional techniques used to elicit information from
consumers in focus groups or one-to-one interviews, information-gathering
approaches that are used in support of the fuzzy front end are often
imagery-based and include, but are not limited to, compare and contrast,
mind maps, word webs, and collages. Quantitative techniques that go
beyond central location tests (CLTs) or home use tests (HUTs) include online
research and intrinsic/extrinsic studies. Online research provides early
exploration into the design of concepts, attitudes, and behavioral
research. Intrinsic/extrinsic research studies the essential aspects of a
product along with the external motivators.
5.3 Types of Affective Tests
There are two main types of affective tests, namely:
1. Qualitative
2. Quantitative; which may be further divided into:
i. Preference tests
ii. Acceptance tests
I. Qualitative tests
Qualitative affective tests are those (e.g., interviews and focus groups)
which measure subjective responses of a sample of consumers to the
sensory properties of products by having those consumers talk about their
feelings in an interview or small group setting. Qualitative methods are used
in the following situations:
To uncover and understand consumer needs that are unexpressed
(example: Why do people buy 4-wheel-drive cars to drive on asphalt?).
Researchers that include anthropologists and ethnographers conduct
open-ended interviews. This type of study, often called “the fuzzy front
end,” can help marketers identify trends in consumer behavior and
product use.
To assess consumers’ initial responses to a product concept and/or a
product prototype. When product researchers need to determine if a
concept has some general acceptance or, conversely, some obvious
problems, a qualitative test can allow consumers to discuss freely the
concept and/or a few early prototypes. The results, a summary and a
tape of such discussions, permit the researcher to understand better the
consumers’ initial reactions to the concept or prototypes. Project
direction can be adjusted at this point, in response to the information
obtained.
To learn consumer terminology to describe the sensory attributes of a
concept, prototype or commercial product, or product category. In the
design of a consumer questionnaire and advertising it is critical to use
consumer-oriented terms rather than those derived from marketing or
product development. Qualitative tests permit consumers to discuss
product attributes openly in their own words.
To learn about consumer behavior regarding use of a particular product.
When product researchers wish to determine how consumers use certain
products (package directions) or how consumers respond to the use
process (dental floss, feminine protection), qualitative tests probe the
reasons and practices of consumer behavior.
Qualitative tests include the use of:
1. Focus Groups
A small group of 10 to 12 consumers, selected on the basis of specific
criteria (product usage, consumer demographics, etc.) meet for 1 to 2 hours
with the focus group moderator. The moderator presents the subject of
interest and facilitates the discussion using group dynamics techniques to
uncover as much specific information from as many participants as possible
directed toward the focus of the session.
Typically, two or three such sessions, all directed toward the same
project focus, are held in order to determine any overall trend of responses
to the concept and/or prototypes. Note is also made of unique responses
apart from the overall trend. A summary of these responses plus tapes,
audio or visual, are provided to the client researcher. Purists will say that 3 ×
12 = 36 verdicts are too few to be representative of any consumer trend, but
in practice if a trend emerges that makes sense, modifications are made
based on this. The modifications may then be tested in subsequent groups.
2. Focus Panels (focus groups with a longer existence)
In this variant of the focus group, the interviewer utilizes the same group
of consumers two or three more times. The objective is to make some initial
contact with the group, have some discussion on the topic, send the group
home to use the product, and then have the group return to discuss its
experiences.
3. One-on-one interviews
Qualitative affective tests in which consumers are individually interviewed
in a one-on-one setting are appropriate in situations in which the researcher
needs to understand and probe a great deal from each consumer or in which
the topic is too sensitive for a focus group.
The interviewer conducts successive interviews with up to 50 consumers,
using a similar format with each, but probing in response to each consumer’s
answers.
One unique variant of this method is to have a person use or prepare a
product at a central interviewing site or in the consumer’s home. Notes or a
video are taken regarding the process, which is then discussed with the
consumer for more information. Interviews with consumers regarding how
they use a detergent or prepare a packaged dinner have yielded information
about consumer behavior which was very different from what the company
expected or what consumers said they did.
One-on-one interviews or observations of consumers can give researchers
insights into unarticulated or underlying consumer needs, and this in turn
can lead to innovative products or services that meet such needs.
All these methods involve small samples, so findings usually need to be
further supported by larger-scale, usually quantitative, studies. However,
small-scale studies often supply insights that would be missed in
large-scale quantitative studies, which by their nature have to focus on
specific attributes. Small-scale studies, on the other hand, give scope for
probing responses and trying to identify the reasons behind them.
II. Quantitative tests
Quantitative affective tests are those which determine the responses of a
large group (50 to several hundred) of consumers to a set of questions
regarding preference, liking, sensory attributes, etc. Quantitative affective
methods are applied in the following situations:
To determine overall preference or liking for a product or products by a
sample of consumers who represent the population for whom the
product is intended. Decisions about whether to use acceptance and/or
preference questions are discussed under each test method below.
To determine preference or liking for broad aspects of product sensory
properties (aroma, flavor, appearance, texture). Studying broad facets of
product character can provide insight regarding the factors affecting
overall preference or liking.
To measure consumer responses to specific sensory attributes of a
product. Use of intensity, hedonic, or “just right” scales can generate
data which can then be related to the hedonic ratings discussed
previously and to descriptive analysis data.
Preference and acceptance tests should not use trained panelists.
i. Preference tests
Simple preference test
Present two samples and ask: Which do you prefer?
You can either force a decision or allow a "no preference" option.
If a "no preference" option is permitted, the no-preference responses may
either be removed from the sample or randomly allocated to either of the
two samples. Either way, particular care is needed in interpreting the
results of any expressed preference. From the point of view of a more robust
statistical analysis, the forced preference method is better. On the other
hand, some testers believe that "A happy panel is a better panel".
The simple preference test is very similar to the directional difference test.
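The two ways of handling "no preference" responses can be sketched as follows. This is a minimal illustration, assuming an exact two-sided binomial test against a 50:50 split as the analysis; the function names and the policy labels "drop" and "split" are hypothetical.

```python
import random
from math import comb

def two_sided_binomial_p(successes, n):
    """Exact two-sided p-value for H0: both samples are equally likely to be preferred."""
    k = max(successes, n - successes)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)

def paired_preference(prefer_a, prefer_b, no_preference=0, policy="drop", seed=None):
    """Return (count for A, count for B, p-value) after handling no-preference votes.

    policy="drop"  removes the no-preference responses from the sample;
    policy="split" allocates each of them at random to A or B.
    """
    if policy == "split":
        rng = random.Random(seed)
        to_a = sum(rng.random() < 0.5 for _ in range(no_preference))
        prefer_a += to_a
        prefer_b += no_preference - to_a
    elif policy != "drop":
        raise ValueError("policy must be 'drop' or 'split'")
    n = prefer_a + prefer_b
    return prefer_a, prefer_b, two_sided_binomial_p(prefer_a, n)
```

Dropping the responses shrinks the sample, while splitting them keeps the sample size but pulls the observed proportions toward 50:50, which is one reason the interpretation needs care in either case.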
Ranking tests
These involve asking subjects to put three or more samples in order of
preference. Care should be taken not to induce sensory fatigue by
introducing too many samples.
An alternative procedure to identify ranking of preference is to use
multiple paired preference tests. This can involve all possible pairs of three
or more samples or selecting one or two samples as controls and rating the
other samples against these.
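When all possible pairs are used, n samples require n(n-1)/2 paired presentations, which grows quickly. The sketch below lists the pairs and derives a preference order from the number of times each sample is chosen. It is an illustration only: ranking by simple win counts is an assumption made here, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def all_pairs(samples):
    """Every unordered pair of samples, i.e. n*(n-1)/2 pairs for n samples."""
    return list(combinations(samples, 2))

def rank_by_wins(samples, winners):
    """Order samples from most to least often preferred.

    winners holds one entry per paired judgment: the sample that was chosen.
    Ties keep the order in which the samples were listed.
    """
    wins = Counter(winners)
    return sorted(samples, key=lambda s: -wins[s])
```

Four samples already need six pairs and six samples need fifteen, so the warning about sensory fatigue applies to this procedure as well.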
Sample size
Preference tests require a minimum of 30 assessors; 100 or more is better.
ii. Acceptance tests
These tests are aimed at identifying a liking for a product. They can be used
for general liking or evaluation of specific attributes. It is possible to infer
preference from acceptance scores.
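One way to infer preference from acceptance scores, sketched here under the assumption that every consumer rated both products on the same scale, is to compare the mean scores and to count how many consumers scored one product higher than the other. The function name and the returned keys are hypothetical.

```python
from statistics import mean

def infer_preference(scores_a, scores_b):
    """Compare two products from paired acceptance scores (one pair per consumer)."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("each consumer must have scored both products")
    pairs = list(zip(scores_a, scores_b))
    return {
        "mean_a": mean(scores_a),
        "mean_b": mean(scores_b),
        "prefer_a": sum(a > b for a, b in pairs),  # consumers who scored A higher
        "prefer_b": sum(b > a for a, b in pairs),  # consumers who scored B higher
        "ties": sum(a == b for a, b in pairs),     # no implied preference
    }
```

The implied preference counts can then be treated like the results of a simple preference test, with tied scores playing the role of "no preference" responses.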
Caution needs to be exercised with attribute testing, e.g. are the tester's
and the panellists' perceptions of sour/bitter/astringent the same?
Rating scales are generally preferred to a simple yes/no response as
they give an indication of degree of liking. It is important that rating
scales are balanced, i.e. the number of "like this" options is equal to the
number of "dislike this" options. You should normally include a neutral
response (neither like nor dislike).
Note that rating scales are prone to central tendency errors, i.e. a
reluctance to use the extremes of the scale, so a sufficient number of scale
points should be provided to counter this.
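The balance requirement can be checked mechanically. The sketch below assumes the standard wording of the 9-point hedonic scale and classifies each label by whether it begins with "like" or "dislike"; the function names are hypothetical and the check is only as good as that naming convention.

```python
# Standard wording of the 9-point hedonic scale, from lowest to highest.
HEDONIC_9 = [
    "dislike extremely",
    "dislike very much",
    "dislike moderately",
    "dislike slightly",
    "neither like nor dislike",
    "like slightly",
    "like moderately",
    "like very much",
    "like extremely",
]

def scale_balance(labels):
    """Return (number of like options, number of dislike options, number of neutral options)."""
    likes = sum(label.startswith("like") for label in labels)
    dislikes = sum(label.startswith("dislike") for label in labels)
    return likes, dislikes, len(labels) - likes - dislikes

def is_balanced(labels):
    """A scale is balanced if likes equal dislikes and there is exactly one neutral point."""
    likes, dislikes, neutral = scale_balance(labels)
    return likes == dislikes and neutral == 1
```

The full 9-point scale passes the check with four like options, four dislike options and one neutral point; dropping options from one end only makes it fail.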
Types of Rating Scale
1. Category scales
Sample is assigned to one of a set of descriptive terms
2. Line scales
Mark a point along a line
The outer limits of the range are marked at each end of the line
3. Ratio Scales
Rates sample against some standard
Always involves a comparison
Needs highly trained panellists to achieve meaningful results
Examples of Rating Scale
1. Likeability scale (9-point hedonic scale)
2. 'Just Right' scales
3. Line or Numerical Scales
Respondent places a mark on a line or gives a number to express the degree
of liking, e.g.
Please score the suitability of product X for use in ... meal
[Line scale anchored from "not at all suitable" to "very suitable for this
occasion"]
4. Likelihood to Purchase or Food-Action-Rating
Eat the whole portion & evaluate the sample on the basis of your experience.
Tick which statement best reflects your opinion
I would eat this at every opportunity
I would eat this very often
I like this & would eat this now and then
I would eat this if available but would not go out of my way for it
I don't like this & would eat it only occasionally ... etc.
Note: This scale as it stands is unbalanced: there needs to be an equal
number of like and dislike points, and there is no clear neutral response.
Advantages:
Provides essential information; bottom line
Can identify liking/disliking segments
Can be related to descriptive profile, other variables in optimization
Liabilities:
Consumer vocabulary fuzzy
Representative samples can be a problem
Preference may be ambiguous
Costs:
Consumer recruiting, qualification as users/likers
Technician time in setup, recruiting, analysis, reporting
Computing required if long questionnaire, large sample
Some products may require controlled facility (odors, noise, etc.)