Acknowledgements
This report has been produced by Fera Science Ltd. after exercise of all reasonable care and skill but is provided without liability in its application and use. The research was commissioned and funded by FSA and Defra. The views expressed reflect the research findings and the authors' interpretation; they do not necessarily reflect Government policy or opinions.
Glossary
Executive Summary
Background
This project, funded jointly by FSA and Defra, involved an up-to-date review of the availability, capability and limitations of methods of analysis for the verification of the country of origin of food and feed. The project also engaged with stakeholders to understand their views relating to the geographical origin of food and feed. Following this review of the current status of origin testing, recommendations are made to build upon previous country of origin research to support the food and feed industry going forward.
Country of origin is defined as the country where food or feed is entirely grown, produced, or manufactured, or, if produced in more than one country, where it last underwent a substantial change. In the UK, EU-assimilated legislation states that indication of the country of origin is a mandatory labelling requirement for food and feed, including products such as meat, vegetables, eggs, honey and wine.
The country of origin claim plays an important role for consumers who tend to relate certain country of origin labelling to superior quality or brand identity. Patriotism (or ethnocentrism) can also play a role in consumer food choice. In Europe, there are 3500 products with a specific geographical origin and their production methods are officially protected (Protected Designation of Origin = PDO; Protected Geographical Indication = PGI; Geographical Indication (for spirit drinks) = GI). These goods often carry a premium price. In addition to customer preference and sale price, country of origin claims are important to businesses when they seek to (i) monitor food miles (carbon footprint), (ii) ensure sustainable sourcing of, for example soy and palm oil (including new Regulation (EU) 2023/1115 on deforestation-free products), (iii) avoid trading of goods which are subject to sanctions, (iv) reassure consumers over concerns of farming and animal welfare standards, (v) avoid foods which are linked to exploitation of farm workers, enforced, or child labour.
‘Verification’ of geographical origin involves testing against a database to confirm that the data for a sample are consistent with those for that geographical location as claimed on a product label. Verification therefore does not involve testing a sample from an unknown location to unequivocally identify its provenance, as such methods are not available or are extremely limited in scope.
Aims
The main aims of this project were two-fold. Firstly, to conduct an up-to-date review of the literature on analytical techniques used for the verification of the country of origin of food and feed commodities. The results of the literature review are included in this report. Each section of the review concentrates on a type of analytical technique and draws together the reports on commodities which have been analysed using that technique. A critique of the techniques is given together with recommendations on their capability and limitations. An outlook section for each technique provides insight for future development potential together with a summary of the most mature methods which demonstrate capability to verify origin for various food commodities.
Secondly, the project engaged with stakeholders. This part of the project ran in parallel to the literature review. The majority of stakeholders consulted are involved in food supply and enforcement with an aim to understand the benefits and challenges in origin verification to the industry. An additional stakeholder was included representing a successful global origin verification organisation in order to inform as a model example of managing geographical origin testing. Information from each of these stakeholders is detailed in this report.
As a smaller contribution to the project, data were also accessed to provide information regarding the number of global official notifications which have been logged for incidents relating to geographic origin declarations in food and feed over the last decade. These data demonstrate the scale of issues relating to geographical origin mislabelling (incorrect origin detailed on the label) in food and feed according to commodity and according to the nation notifying on the issue. Finally, information regarding costs to set up laboratories to verify origin claims, and also charges for commercial testing services, have been included to inform regarding future support in this area which would involve official control laboratories.
Findings and limitations regarding the technologies
Methods to verify origin are wide-ranging with most data usually resulting from the testing of stable isotope ratios combined with trace element profiles, in conjunction with chemometrics. No single methods are usually unequivocal for determining origin, but data can be used to verify if the origin testing outputs are consistent with the origin declared on a food label or in the paper trail. Databases of, for example, Stable Isotope Ratio Analysis (SIRA) and trace element data have been developed for certain commodities, the geographical scope of which varies greatly, often from small areas in a single country to a small number of areas within several countries. The quality and up-to-date status (to account for annual and seasonal variation) of these databases is critical to the relevancy of testing outputs. In addition to stable isotope ratio and trace element profiling, depending on the product, other technologies can often add value to origin data, including metabolomics, genomics, various lower-cost spectroscopy methodologies, blockchain and proteomics. Certain technologies will be more applicable than others, depending on the commodity.
A list of the most promising technologies for the main commodities for which research has been conducted is provided below:
- Cereals: Trace element analysis and SIRA.
- Cocoa: Near infra-red (NIR) spectroscopy and sensory techniques with AI.
- Coffee: SIRA in combination with trace element analysis.
- Fish and shellfish: While fish and shellfish represent the third most notified commodity for misdescription of origin, there is a particular challenge in assessing geographical origin of these commodities. Trace element profiles vary greatly due to a combination of natural and anthropogenic activities, climate change, the inherent fluidity of the marine and fresh water environment and depending on harvest time. It is therefore likely that the combination of several technologies will provide the most informative models overall, including trace element analysis, NIR and Rapid Evaporative Ionisation Mass Spectrometry (REIMS) study of lipid markers. Due to the wide range of varying factors, it will be imperative that databases are constantly updated to account for these variations.
- Fruit juice: Trace element analysis.
- Honey: Pollen analysis using light microscopy and the study of volatile compounds, sugars, organic acids and amino acids have been used to differentiate between floral types of honey. SIRA and trace element analysis, and metabolomic and genomic approaches in combination with blockchain, are also promising.
- Meat: SIRA and trace element analysis with fatty acid profiling, along with radio frequency identification (RFID) to monitor livestock movement.
- Olive oil and other edible oils: SIRA combined with NMR and profiling of phenolic compounds, fatty acid profile (including fatty acid methyl esters (FAMEs)), sterols, triacylglycerols (TAGs), volatile compounds and colour. The potential of attenuated total reflectance Fourier transform infra-red (ATR-FTIR) spectroscopy should also be considered.
- Rice: SIRA with trace element analysis.
- Tea: Trace element profiling.
- Whisky: Gas chromatography (GC), liquid chromatography (LC), spectroscopy and trace element analysis.
- Wine: SIRA and site-specific natural isotope fractionation nuclear magnetic resonance (SNIF-NMR) methodologies, with possible support from trace element analysis.
The commodities for which the largest numbers of incidents have been notified over the last decade relating to geographical origin are honey and wine, followed by fish, meat, olive oil and food/dietary supplements.
While blockchain could aid paperchain traceability in a weight-of-evidence approach and can be searched in seconds, a common opinion is that implementation costs of these systems are high. The second common concern is that blockchain requires honest participation in terms of, for example, correctly declaring the commodity, the reputability of the handling party and where they are located. At present, there is no means of validating that information logged in a blockchain system is genuine.
Stakeholder engagement
There were two parts to the stakeholder engagement. First, a stakeholder representing a successful working model of origin testing was interviewed to inform this project. World Forest ID, primarily a timber origin authentication organisation but also working increasingly in food origin verification for Forest Risk Commodities, was included in the stakeholder engagement to understand how this organisation successfully conducts origin testing.
Key aspects to the success of this model include:
- Curation of the data by a single organisation.
- Building databases which span global geographies.
- Ensuring the authenticity of samples in the databases.
- Regularly expanding databases to account for seasonal and annual variation, for example relating to climate change and other natural and anthropogenic variation.
- Training to provide standardised methods to prepare the data.
- Testing by harmonised methods to provide confidence in the accuracy and reproducibility of the data.
- Inter-laboratory data comparisons to monitor reproducibility.
- Sharing of data so that databases are publicly accessible and transparent.
Other working examples of food origin testing include SIRA and trace element databases funded by the consortium of producers of Grana Padano (PDO) and Parmigiano Reggiano (PDO) cheeses to monitor the origin of retail samples, with successful inter-laboratory validation.
The main outputs of the engagement with stakeholders from elsewhere in the food and feed industry were as follows:
- Stakeholders in the food and feed industry are aware that there is no ‘silver bullet’ for origin testing.
- In terms of testing activities, food safety testing is the main focus of the industry, followed by species identification. Origin testing is a lower priority.
- Stakeholders are aware that stable isotope ratio analysis and trace element analysis tend to be the most applicable technologies for assessing the origin of the majority of commodities and that these tests can be used to interrogate the accuracy of food and feed traceability documents.
- Stakeholders highlighted the lack of accredited services and proficiency testing as a barrier to their origin testing.
- Stakeholders emphasised that databases can be out of date because they have often not been added to for some time. Regular additions are necessary to account for recent variations, and the data are therefore less than ideal.
- It was highlighted that many databases are not publicly accessible, being accessed only by testing service providers, and stakeholders reported that the quality, composition, applicability to different matrices and current relevance of the databases are not always transparent.
- Stakeholders stated that, while many of the larger members of the food supply chain are aware of the importance of accurate origin labelling and perform testing to challenge label information, more support is needed for smaller organisations. Education at these organisations may enable more issues to be picked up earlier in the supply chain and ultimately reduce the risk of more widespread fraud.
Conclusions
- There is no unequivocal single technique to verify the claimed country of origin of food and feed.
- Testing methods tend to inform as to whether data are in line with labelling or certification claims but rarely categorically assign origin.
- The maturity of the testing for origin varies greatly across commodities and geographical locations.
- The validity of studies undertaken to date is highly dependent on the number and authenticity of the samples used as a reference collection.
- Combined SIRA and trace element analyses are the most widely used. Depending on the commodity, other technologies such as genomics, metabolomics and spectroscopy can add value.
- Multivariate data generally improve classification rates and robustness.
- Livestock can move geographically during their lifetime. From an analytical perspective, it is therefore challenging to identify meat origin, and both geography and feeding regimes must be considered.
- Validity of studies undertaken to date is highly dependent on the number, nature and geographical range of the samples used in a reference database. In many cases the reference databases are currently inadequate or outdated.
- Curation of databases and standardisation of data capture and methods of analysis are key, and are currently lacking.
- The lack of continued expansion of previously established databases renders datasets out of date, because seasonal and annual variation is not accounted for. Before the databases can be used in the future, old data must be tested against contemporary samples to examine temporal trends or to establish whether data have remained stable in the intervening period. While such multivariate tests can be completed relatively easily, they add delay if an issue arises in the supply chain. During this delay, samples cannot be tested within the short timeframe likely to be required for fresh food, or in time to investigate the early stages of an issue before it becomes widespread. Ideally, databases would be maintained so that they are fit-for-use as and when needed.
- Databases should be regularly challenged with new/additional samples to ensure continued relevance.
- Obtaining authentic matrix-matched reference materials to support testing is a critical problem which requires funding to achieve best practice under the principle of identical treatment.
- Database sharing is poor due to investment and IP concerns. This can limit progress in verification of origin.
- Standardisation activities are under development by AOAC, CEN and CODEX; however, these activities take time to complete.
Other barriers to geographical origin verification relate to:
- Costs – the main mature methods used require mass spectrometric analysis, which requires high-cost instruments and skilled analysts. Investment in building and maintaining databases, along with a programme of regular food testing, would mean that a higher number of samples is tested and the methods would become more cost effective. Investment is also required to continue to develop and assess lower cost methods such as spectroscopy.
- Demand for proficiency testing is lacking for geographical origin determination and this is undermining the trustworthiness of the data. Transferability of data from one lab to another must be demonstrated by proficiency trials.
- Few methods to verify origin are accredited and the accreditation process can take twelve months. Means to fast-track methods for accreditation are needed when a method is specifically developed to address and investigate a known issue in the supply chain.
- It is challenging to present multivariate data in court to support origin fraud cases. Only a limited number of methods have been considered ready to be used in court.
- Due to the nature of the methods, composite products, or adulterated products containing a commodity from a mixture of origins, are problematic for origin verification, and the achievable limits of testing for adulterated samples must be understood.
Future direction
Recommended future direction is detailed below:
All stakeholders highlighted challenges with a lack of quality data or up-to-date data in the various databases which have been prepared over the years in support of food provenance testing. In the future, investment could be made to support the building of robust datasets, necessitating the harmonised collection and analysis of a high number of authentic samples from a large (e.g. global) geographical range for commodities which are vulnerable to origin fraud.
These datasets should be curated by a single organisation to facilitate harmonisation. Datasets should be regularly expanded with samples to account for seasonal and annual variation. The single organisation should be funded so as to allow continued expansion and curation of the datasets over time so that a database does not lose relevance and is fit-for-use as soon as required for any supply chain issue which may arise. The suitability of the datasets should be challenged each year by the testing of additional samples (i.e. not the samples already included in the prediction models) and all testing facilities involved should take part in proficiency trials and achieve acceptable data, demonstrated through statistical measurements, for example z-scores.
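The z-score criterion mentioned above can be illustrated with a short sketch. It assumes the conventional proficiency-testing definition z = (x − X)/σp, where x is the laboratory result, X the assigned value and σp the standard deviation for proficiency assessment, together with the commonly used interpretation bands (|z| ≤ 2 satisfactory, 2 < |z| < 3 questionable, |z| ≥ 3 unsatisfactory). The function names and example values are hypothetical and are not taken from any scheme discussed in this report.

```python
def z_score(result, assigned_value, sigma_p):
    """Proficiency-test z-score: (laboratory result - assigned value) / sigma_p."""
    if sigma_p <= 0:
        raise ValueError("sigma_p must be positive")
    return (result - assigned_value) / sigma_p


def classify_z(z):
    """Conventional interpretation bands for a z-score (an assumption of this sketch)."""
    if abs(z) <= 2.0:
        return "satisfactory"
    if abs(z) < 3.0:
        return "questionable"
    return "unsatisfactory"


# Hypothetical results from three laboratories for one test material
# (assigned value 10.0, sigma_p 0.5; illustrative numbers only).
for lab, result in [("Lab A", 10.5), ("Lab B", 11.25), ("Lab C", 12.0)]:
    z = z_score(result, assigned_value=10.0, sigma_p=0.5)
    print(lab, z, classify_z(z))
```

In a real scheme the assigned value and σp would be set by the proficiency test provider rather than chosen by the participating laboratory.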
Facilitate the fast-tracking of methods for accreditation when a method is needed to address and investigate a known issue in the supply chain.
More matrix-matched certified reference materials should be made available, which requires investment.
Demand for proficiency testing in origin analyses should be encouraged and as a result of this, certified testing schemes, which are currently lacking, would be initiated.
There is much existing data from other surveillance testing exercises which are not yet/rarely used in origin verification and which could have value for origin determination. This could be explored. Investment could be made to initiate the collation of these existing data and metadata and investigate their utility for origin verification purposes. Prediction models could be built to determine the relevance of these data for addressing origin issues. Such existing data could include environmental and geological data, pesticide residue data, dioxin and polychlorinated biphenyl levels (particularly relevant to foods including salmon), viral screen data and fungi species presence information.
There is scope for the emergence of other methods, either for initial, fast screening or more detailed analysis. While hand-held spectroscopy technologies have been used in small proof-of-principle studies, multivariate analysis may be considered as a natural next step in the quest to verify origin, fusing data from various technologies.
Opportunities for public, private and inter-nation funding to improve methods in geographical provenance could be sought to protect our supply chain.
Government funding and private and EU funding opportunities in origin determination could be considered to expand collaborations between those working on building datasets for a commodity (or range of commodities) across different geographical locations and, in turn, to expand the geographical range of datasets. Specific high value commodities that generate large tax incomes could be prioritised for engagement since these industries could more readily support the testing costs. An example of such includes Scotch whisky.
1. Background to the project
The constantly evolving network of global food/feed trade has led to an unprecedented variety of foods available to consumers. At the same time, consumers are increasingly interested in the provenance of food, especially in view of various food scandals, ranging from contaminated eggs, substitution of beef with horse meat in ready meals and mislabelled beverages to, more recently, ‘Operation Hawk’[1] (pork from South America and Europe falsely labelled as British) and FSA National Food Crime Unit investigations into the geographical provenance of corned beef.
The country of origin claim plays an important part for businesses when they seek to (i) monitor food miles (carbon footprint), (ii) ensure sustainable sourcing of, for example, soy and palm oil and (iii) avoid trading of goods which are subject to sanctions. In addition to the above, consumers tend to relate certain country of origin labelling to superior quality or brand identity (for example Parma ham, New Zealand Mānuka honey and Modena Balsamic vinegar) or safety (linked to expected high animal welfare standards and control procedures in place), so that products are marketed at higher prices than similar products of other or unknown provenance. High volume products such as fruit and grain are also vulnerable to origin fraud due to the large amounts of related value. Fragmented supply chains, often due to geopolitical situations, can also result in the supply chain becoming more vulnerable to origin fraud (for example Chinese raspberries being described as Chilean). Unscrupulous producers might buy cheaper raw materials from other regions/countries and illegally sell the product with different origin labels for financial gain. Since the UK left the EU, the UK government has been considering the introduction of labels on UK-grown food to protect farmers and consumers over concerns of welfare standards in some food imports.
Country of origin is defined as the country where food or feed is entirely grown, produced, or manufactured, or, if produced in more than one country, where it last underwent a substantial change. In the UK, EU-assimilated legislation states that indication of the country of origin is a mandatory labelling requirement for food and feed, including products such as meat, vegetables, eggs, honey and wine[2],[3] (Appendix 1).
In Europe, there are 3500 products[4] with a specific geographical origin and their production methods are officially protected (Protected Designation of Origin = PDO; Protected Geographical Indications = PGI) – EU 664/2014[5]. These are registered in a public database that was launched in 2019: eAmbrosia[6] - the European Union (EU) Geographical Indications register. Since January 2021, following the EU exit, the UK GI (geographical indication) schemes and logos[7] have been established:
Designated origin - UK protected (PDO), Geographic origin - UK protected (PGI) and Traditional Speciality - UK protected (TSG). The principles and benefits of the GI scheme remain the same and continue to provide intellectual property protection.
‘Verification’ of geographical origin involves testing against a database to confirm that the data for a sample are consistent with those for that geographical location. Verification therefore does not involve testing a sample from an unknown location and unequivocally identifying its provenance.
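The distinction between verification and identification can be made concrete with a minimal sketch. It assumes a deliberately simple rule, under which a sample is "consistent" with a claimed origin if every measured variable lies within k standard deviations of the reference mean for that origin. Real schemes typically use multivariate chemometric models, so this rule, the variable names (`d13C`, `d18O`), the function name and all values are hypothetical and for illustration only.

```python
from statistics import mean, stdev

# Hypothetical reference database: authentic samples grouped by origin.
# All values are invented for illustration.
REFERENCE_DB = {
    "Region A": [
        {"d13C": -26.1, "d18O": 1.2},
        {"d13C": -25.8, "d18O": 1.6},
        {"d13C": -26.4, "d18O": 0.9},
        {"d13C": -25.9, "d18O": 1.4},
    ],
}


def is_consistent_with_claim(sample, claimed_origin, db=REFERENCE_DB, k=2.0):
    """Check a sample against the reference data for the CLAIMED origin only.

    Returns True if every measured variable lies within mean +/- k standard
    deviations of the reference samples. It never searches other origins,
    so it verifies a claim rather than identifying an unknown provenance.
    """
    reference = db[claimed_origin]  # KeyError if no reference data exist
    for variable, value in sample.items():
        ref_values = [row[variable] for row in reference]
        if abs(value - mean(ref_values)) > k * stdev(ref_values):
            return False
    return True


print(is_consistent_with_claim({"d13C": -26.0, "d18O": 1.3}, "Region A"))  # consistent
print(is_consistent_with_claim({"d13C": -24.0, "d18O": 1.3}, "Region A"))  # not consistent
```

A result of "not consistent" would normally trigger follow-up investigation, for example of traceability documentation, rather than being treated as proof of mislabelling.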
In partnership with Defra, the FSA conducted a study in 2014[8] which used stable isotope ratio analysis (SIRA) and paper-based approaches to assess whether foods claiming to originate from the UK and Ireland were as claimed. It examined 96 food samples of which 78 samples were shown to be consistent with the origin claimed, while 18 were identified for follow-up investigation and later confirmed as consistent using traceability documentation. The study demonstrated that SIRA had potential to verify country of origin claims when used in combination with traceability and other evidence, but had some limitations, including a need for comprehensive databases to support the method[9]. This triggered the supporting project (FA0155) during the EU FoodIntegrity Project[10] (2014-2018, Collaborative Project: 613688, Seventh Framework Programme, KBBE.2013.2.4-01: Assuring quality and authenticity in the food chain) which aimed to address database issues raised in this study to maximise their sharing, harmonisation and use to verify food origin, so contributing to assuring food integrity.
Defra has also commissioned several research projects to look at country of origin verification methods for a variety of foods between 2012 and 2019[11],[12],[13],[14],[15],[16]. Techniques investigated included isotopic analysis, deoxyribonucleic acid (DNA) speciation, single nucleotide polymorphism (SNP) genotyping and metagenomics, and commodities investigated included fish/fish products, egg, oysters, chicken, cheese (Blue Stilton) and beef. In general, processed products containing multiple ingredients (for example sausages) present additional challenges, and determining where ‘substantial changes’ may have occurred during production can be complicated. For these techniques, robust reference databases and reference materials can be limited.
In the EU, new regulations were brought about following the horsemeat scandal, including Regulation (EU) 2017/625 that focuses on enhancing the food control systems of the EU and its member states. This regulation not only includes food safety, but also food authenticity (Kulling et al., 2019). The European Commission for the first time recommended setting up so-called European Union reference centers for the authenticity and integrity of the agri-food chain, with responsibilities to: (1) provide specialised knowledge in relation to food authenticity and integrity and to the methods for detecting fraud; (2) provide specific analyses designed to identify the segments of the agri-food chain that are more vulnerable; (3) establish and maintain collections or databases of authenticated reference materials; (4) disseminate research findings and technical innovations. One of these centres was created in Germany, NRZ-Authent, as part of the Max Rubner Institut (MRI), the Federal Bureau of Statistics and public prosecutor’s offices. NRZ-Authent comprises infrastructure modules dedicated to IT, databases, accreditation and National Reference Laboratory functions, and also, thematic modules, one of which focuses on geographical origin verification. Standardisation and validation of methods is a key objective of this centre, as is the creation of a database of analytical methods available to food control authorities, including information about relevant control authorities, academic and private laboratories and companies. This way, they aim to have an up-to-date map of capabilities in Germany that will facilitate the identification of research gaps and decision making regarding further developments. Having left the EU, the UK must plan strategies to verify the geographical origin of food.
This project aims to conduct an up-to-date review of the availability of methods of analysis suitable for the verification of the country of origin of food and feed, and to provide recommendations on their capability, limitations and applicability to the delivery of official controls. It takes into account the current regulatory position and the fact that many public analysts do not have in-house access to many of the more sophisticated techniques such as IRMS. The project will provide practical recommendations on future direction to progress the UK’s capability for origin verification. This will be delivered through review of the literature and engagement with stakeholders.
2. Literature review
2.1. Introduction
A wide range of analytical techniques have been used to support the determination of geographical origin of food and feed. In 2008, an overview of these analytical methods was published by Luykx and van Ruth (Luykx & van Ruth, 2008) that included mass spectrometry, spectroscopy, separation methods and other techniques. This article highlighted that a combination of methods measuring different types of analytes seemed to be the best approach to establish the geographical origin of food, with chemometric analysis of the ‘fused’ data supporting the interpretation of such a multifactorial approach. Since then, many more articles have been published on the topic, and the main technologies are still at the forefront, but many others have been applied to the determination of geographical origin of food and feed, and important advances have taken place on computational tools to support the analytical technologies.
A review of published international literature has been conducted in this project to identify current and emerging methods of analysis for the verification of the country of origin of food and feed, covering the past 10 years. This includes peer-reviewed literature and grey literature (the latter being searched by Google and including, for example, information from the media, institute proceedings and trade journals).
This review provides a critical assessment of techniques for country of origin verification of food and feed, including SIRA, trace element analysis, DNA methods, SNP genotyping, proteomics and metabolomics. Emerging methods to verify geographical origin and availability of quality reference databases, reference materials and proficiency testing schemes are also addressed.
2.2. Methodology
A list of relevant keywords, text phrases and the date range to be used in the literature searching elements of the project was agreed with FSA and Defra (Table 1). These keywords and text phrases were used to search Web of Science. Additional web searches to identify literature were also completed.
A framework to grade each article was agreed by the project team prior to filtering. Each article was assigned a grade relating to relevance and the robustness of the data. An initial 5-star rating was applied (star ratings 3 to 5 for large authentic database, established technique or standard, and review, respectively). Articles which were deemed non-relevant (star rating 1, for example not fitting the country of origin topic) or to contain non-robust data (star rating 2, for example method only validated in house with fewer than 50 samples) were filtered out. As a quality check, a second team member reviewed the lists of articles which had been retained and filtered out.
Since the number of articles selected through the rating system was too high, a second round of selection was applied, giving more focus to recent articles and reviews and relevant papers therein.
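The first-round filter described above can be sketched as follows. This is an illustration of the stated rule only (ratings 1 and 2 filtered out, ratings 3 to 5 retained); the `Article` class, the function name and the example titles are hypothetical and do not reproduce the project team's actual tooling.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    stars: int  # 1 = non-relevant, 2 = non-robust data, 3 to 5 = retained grades


def first_round_filter(articles, min_stars=3):
    """Split articles into (retained, filtered_out) using the star rating."""
    for article in articles:
        if not 1 <= article.stars <= 5:
            raise ValueError(f"star rating out of range: {article.title}")
    retained = [a for a in articles if a.stars >= min_stars]
    filtered_out = [a for a in articles if a.stars < min_stars]
    return retained, filtered_out


# Hypothetical articles for illustration only.
articles = [
    Article("Off-topic study", 1),
    Article("In-house method, small sample set", 2),
    Article("Large authentic database", 3),
    Article("Review article", 5),
]
retained, filtered_out = first_round_filter(articles)
print([a.title for a in retained])
```

Returning both lists mirrors the quality check in the text, in which a second team member reviewed the retained and the filtered-out articles.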
3. Review of techniques
The content of the review has been structured around technology categories, and within these, the information has been split by commodity where possible.
3.1. Stable Isotope Ratio Analysis (SIRA)
Traditionally, SIRA is the method of choice to determine geographical origin of food and feed, using an elemental analyser coupled to an isotope ratio mass spectrometer (EA-IRMS) for bulk analysis (for example defatted meat protein, olive oil, wine, honey). The method works because the isotope signals from the bio-elements (H, C, N, O, S) present in local feed and water transfer to the animal and plant tissue, and are in turn linked to their local environment. There are three scenarios in which SIRA can be applied to verify geographical origin: (i) upfront investment to generate and curate a database for a specific food (for example the EU wine database (Christoph et al., 2015)), (ii) keeping a library of reference samples to compare to (for example Agriculture and Horticulture Development Board (AHDB) pork) and (iii) simple batch-to-batch comparison on an ad hoc basis.
A more niche application concerns compound-specific analysis (for example vanillin, other flavours or organic acids), which is performed by a gas chromatograph coupled to IRMS (GC-IRMS). More recently, liquid chromatography coupled to IRMS (LC-IRMS), which evolved from questions of food adulteration (for example individual sugars of honey), has been applied to the geographical characterisation of Italian grape musts (Perini & Bontempo, 2022; Perini et al., 2023).
Of the manuscripts identified as using SIRA to determine geographical origin, only 15 related to methods for which a database was available (n>200 per isotope), reflecting that the majority of approaches published have not yet been tested on a sufficiently large number of samples to prove their potential to successfully determine geographical origin for food commodities. A table summarising details of the databases covered by these 15 articles is presented below (Table 2), and further details are shown in Appendix 2. Methods tended to show improved classification rates and robustness when SIRA was used in combination with other analytical techniques to verify geographical origin, namely trace elements, fatty acids, nuclear magnetic resonance (NMR) and DNA, or with environmental data of isoscapes (maps representing the natural variation in stable isotope, or indeed trace element, composition) for δ2H and δ18O of precipitation and/or ground water. Analysis by EA-IRMS is more prominent than GC-IRMS and LC-IRMS.
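For readers unfamiliar with the δ notation used for isotope data in this section, the conventional definition is recalled below, using oxygen as the example. This is standard background rather than a definition taken from the reviewed articles; R denotes the ratio of the heavy to the light isotope, and the result is usually multiplied by 1000 and reported in per mil (‰) relative to an international standard (for example VSMOW for hydrogen and oxygen, VPDB for carbon).

```latex
\delta^{18}\mathrm{O} = \frac{R_{\mathrm{sample}}}{R_{\mathrm{standard}}} - 1,
\qquad
R = \frac{{}^{18}\mathrm{O}}{{}^{16}\mathrm{O}}
```

A negative δ value therefore indicates that the sample is depleted in the heavy isotope relative to the standard, and a positive value that it is enriched.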
Various food fraud databases are listed by the European Commission[17]. Those which hold data particularly relevant to food and feed geographical origin testing include the EU Wine DataBank, Oleum DataBank (olive oil), Honey DataBank and Horse meat database.
A summary of articles selected for the top commodities (highest percentage of publications within the time frame of the literature review, namely: wine, meat, rice, olive oil and coffee) found in the searches is presented in the sections below.
3.1.1. Wine
In the EU, each member state analyses the isotopic composition of a certain number of wine samples per vintage year as a commitment to the EU wine data bank, which was established in 1991 for the control of the wine sector. The accepted analytical methods for the determination of the required isotope ratios ((D/H)I, (D/H)II, R-value, δ18O and δ13C) are IRMS and SNIF-NMR. On the basis of the isotope data of the vintages 1997 to 2004, average values for Vienna, Lower Austria, Styria and Burgenland could be calculated (Philipp et al., 2018) and some trends identified. However, prior knowledge of the vintage year remains advantageous in order to verify wine region of origin. The EU Wine Databank is unique in that it was initially developed to address general wine authenticity issues (detecting adulterations with sugar or water) using SIRA and SNIF-NMR, and was later extended to address geographical origin issues. The vintage and region of production are usually explicitly claimed on the label and so an appropriate sub-section of a larger database can be used for comparison and verification. The (D/H)I and (D/H)II ratios of the methyl and methylene groups in the ethanol of a wine were later exploited for origin verification purposes. The (D/H)I ratio of the methyl group is linked to botanical origin (C3 plant), while the (D/H)II ratio of the methylene group gives an indication of the fermentation water and therefore the geographical origin of a wine.
The integrity of 50 European wines sold in China was checked, based on anthocyanin composition, stable isotopes and glycerol (Müller et al., 2021). Glycerol is a by-product of fermentation that provides a perception of sweetness and improves ‘mouth feel’. It is sometimes added to wine, as an adulterant, for these reasons. As stated by these authors, in general, wine that has been diluted with tap water shows decreased δ18O values, because evapotranspiration enriches the water in the plant in the heavier isotope relative to tap water. There is a consensus that δ18O in wine water in southern Europe typically does not fall below -2 ‰. In colder and generally more humid central Europe, wines usually do not show δ18O values below -5 ‰. The authors reported that the European-labelled wines sold in China ranged between δ18O -6.1 ‰ and 8.7 ‰. On this basis, it was concluded that five samples were clearly mislabelled as French wines as they showed δ18O values below -2 ‰, two of them even below -5 ‰.
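The screening logic described above can be sketched as a simple threshold check. This is a minimal illustration only: the function name, the zone labels and the idea of a fixed cut-off per zone are assumptions made for the example, and the lower bounds are the typical consensus values quoted in the text rather than regulatory limits.

```python
# Typical lower bounds for wine water δ18O (in ‰) quoted in the text.
# The zone labels are hypothetical names chosen for this sketch.
LOWER_BOUNDS = {
    "southern_europe": -2.0,
    "central_europe": -5.0,
}


def is_d18o_suspicious(d18o_permil, claimed_zone):
    """Return True when the measured δ18O falls below the typical lower
    bound for the claimed zone, which may indicate dilution with water
    or a mislabelled origin and would warrant further investigation."""
    return d18o_permil < LOWER_BOUNDS[claimed_zone]


# Usage: the lowest value reported in the study (-6.1 ‰) falls below
# both bounds, whereas -3.0 ‰ only falls below the southern bound.
print(is_d18o_suspicious(-6.1, "central_europe"))
print(is_d18o_suspicious(-3.0, "southern_europe"))
print(is_d18o_suspicious(-3.0, "central_europe"))
```

A check of this kind can only flag samples for follow-up; it does not by itself establish the true origin.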
Another research group, Wu et al., combined SIRA and trace element profiling in conjunction with carbon and oxygen isoscapes, applying partial least squares discriminant analysis (PLS-DA) and using an Artificial Neural Network (ANN) to verify the origin of 240 French red wines from four regions (Bordeaux, Burgundy, Languedoc-Roussillon and Rhone) with verification accuracy of 98.2% (Wu et al., 2021) and 600 red wines from 8 countries (France, Italy, Spain, USA, Chile, South Africa, Australia and China) with overall discrimination accuracy of 83.9% (Wu et al., 2019). The wines had been sourced from Shenzhen Customs and also via reliable wine importers prior to comparison with Chinese wines.
3.1.2. Meat
There are many reports where stable isotope data have been used to support the development of models for the determination of geographical origin of beef, pork and lamb. For example, Park et al. verified the origin of 252 defatted Korean (Korean Institute for Animal Products) and non-Korean (n= 94, Korean Meat Import Association) pork samples, by combining trace element content and their isotopes (27 isotopes of 19 elements) with stable isotope ratios of carbon and nitrogen, resulting in a discrimination rate of 100% for their multivariate chemometric linear discriminant analysis (LDA) approach (Park et al., 2018). In particular, the isotope ratios of the heavy elements (88Sr/86Sr, 88Sr/84Sr, 208Pb/207Pb, 208Pb/206Pb, 114Cd/111Cd, 114Cd/110Cd, 112Cd/111Cd, and 112Cd/110Cd) were greater for Korean pork samples than for the non-Korean ones (Chilean, Mexican, Spanish, Canadian, German, US American).
Pianezze et al. (2021) critically reviewed publications which applied SIRA to tracing the geographical origin of lamb and confirmed that isotope ratios of carbon mainly discriminate differences in animal diets, while the isotopic ratios of hydrogen and oxygen tend to discriminate geographical origin. Models described in their review could reach total accuracy of 90% or more. The authors stated that these results could potentially be improved even further by considering additional variables such as fatty acids and trace mineral elements. They recommended creating a lamb-based reference material to enable the comparability of isotope ratio data specific to PGI and PDO lamb products across multiple laboratories (Pianezze et al., 2021).
Most recently, Bontempo and Perini published the evaluation by principal component analysis (PCA) of the hydrogen (2H/1H), carbon (13C/12C), nitrogen (15N/14N), and sulphur (34S/32S) isotope ratio data from 227 authentic and defatted beef samples (Framework 6 EU project ‘TRACE, Tracing Food Commodities in Europe’, Grant Number 6942), which were collected from 13 sites in eight countries. Whereas country of origin could not be determined in all cases, coastal and inland regions, areas of different latitudes and breeding systems were successfully discriminated. The PC1/2 plot explained 77.5 % of the data variation and revealed a clustering into three groups: a) coastal areas at lower latitudes, using C4-based diet (Barcelona, Chalkidiki, Sicily and Tuscany), b) coastal areas at higher latitudes (Cornwall, Orkney and Bohernagore) and c) inland areas at lower latitudes, using a mixed diet (Limousin, Frankonia, Allgäu, Gäuboden, Mühlviertel and Trentino). The results presented by the authors are encouraging and they recommend enlarging the isotope databanks for each country, as this would help to verify the authenticity claims of beef samples and not only protect, but also promote PDO and PGI products, for example Orkney beef (Bontempo et al., 2023).
3.1.3. Rice
Rice is an important staple food in China and its authenticity is closely associated with nutrition and safety (Z. M. Li et al., 2022). In 2017, 900 Japonica and Indica rice samples from 17 Chinese provinces were collected to establish a comprehensive database for stable isotopes (H, C, N, O) and trace elements (K, Mg, Ca, Na, Fe, Zn, Mn, Cu, Ni, Cr and Mo), with the goal of origin discrimination of four regions: Middle-Lower Yangtze Plain, Northeast, Southwest and Southeast. Qualitative models were built using back propagation artificial neural networks (BP-NN). All the samples were randomly divided into a training set (70%), a validation set (15%) and a testing (blind) set (15%). Classification of Japonica and Indica rice from different regions was achieved with a high accuracy (97.2% and 77.9%, respectively), but annual climatic variations and environmental factors influencing these fingerprints need further exploration (C. L. Li et al., 2022).
Sheng et al. (2022) used a similar set of 794 rice samples from 2017 to build isoscape models with focus on environmental similarity which allowed the prediction of the geospatial distribution of δ13C, δ2H and δ18O values for Chinese rice in 2018 and to validate the results with 132 actual samples from 2018. The climate data was derived by interpolating gauged daily temperatures, precipitation, relative humidity and sunshine duration from 825 Chinese Meteorological Administration stations[18] in 2017. For the predictive model rice samples were randomly divided into 555 training samples (70 %) and 239 validation samples (30 %) and these dataset divisions were repeated 10 times. The training samples were used to develop an isoscape model, and then the validation samples were used to evaluate the prediction model. The mean correlation coefficients (r) for observed and predicted δ13C, δ2H and δ18O values were 0.82, 0.81 and 0.63 respectively, meaning that the four main production regions in China: Northeast China, middle to lower Yangtze River plain, Southwest China and Southeast China could be distinguished (Sheng et al., 2022).
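The repeated random division into training and validation samples described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: the function names are hypothetical, and a one-variable least-squares line (`fit_linear`) stands in for the isoscape model, whose actual form is not given here.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_indices(n_samples, train_fraction, rng):
    """Randomly divide sample indices into training and validation sets."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    n_train = int(n_samples * train_fraction)
    return idx[:n_train], idx[n_train:]


def fit_linear(x, y):
    """Stand-in model (assumption): least-squares line through one covariate."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return lambda value: intercept + slope * value


def repeated_validation(features, observed, fit, n_repeats=10,
                        train_fraction=0.7, seed=0):
    """Repeat the random split, refit on each training set and return the
    mean correlation between observed and predicted validation values."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        train, valid = split_indices(len(observed), train_fraction, rng)
        predict = fit([features[i] for i in train],
                      [observed[i] for i in train])
        predicted = [predict(features[i]) for i in valid]
        scores.append(pearson_r([observed[i] for i in valid], predicted))
    return mean(scores)


# With 794 samples and a 70 % training fraction the split sizes match
# those reported in the text (555 training, 239 validation).
train, valid = split_indices(794, 0.7, random.Random(0))
print(len(train), len(valid))
```

Repeating the split guards against a single favourable or unfavourable partition dominating the reported correlation.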
3.1.4. Olive oil
Olive oil, and in particular extra virgin olive oil, is highly priced and linked to fraudulent activities such as blending with lesser seed/nut oils or mislabelling of country of origin, for example trading Spanish olive oil as Italian. Italian olive oil holds the highest price on the current olive oil market[19].
There are numerous examples in the literature of application of SIRA for geographical verification of olive oils. In a study by Jiménez-Morillo and co-workers, virgin olive oils (VOO, n= 138, 2016 and 2017), prepared within 24h from harvest of olives, from three different Mediterranean countries (Portugal, France and Turkey) were analysed. The countries could be discriminated on the basis of multivariate statistical analysis of geoclimatic and isotopic data (δ13C, δ2H and δ18O) based on bulk analysis by EA-IRMS (Jiménez-Morillo et al., 2020).
During the harvest period 2015-2016 of Koroneiki olives in Greece, a total of 210 extra virgin olive oil (EVOO) samples from 6 different regions were analysed for isotopic values of 18O (bulk oil) and 13C (biophenolic extracts and bulk oil) (Karalis et al., 2020). For comparison, data also included harvests from 2005/2006/2010, which allowed climatic changes to be monitored over time. A relationship between δ18O Water (springs used for irrigation) and δ18O EVOO was also demonstrated. Regarding the δ13C values, results for bulk EVOO and biophenolic extracts were very similar and can be used interchangeably for differentiating Ionian Island olive oils from olive oils originating from Crete and Chalkidiki, when applied in conjunction with δ18O EVOO.
The addition of fatty acid profiles (concentration and δ13CFA) to the stable isotope ratio of the bulk EVOO (H, C and O) has been recommended for improving the differentiation of vegetable oils from the Southern and Northern hemisphere (Spangenberg, 2016).
Paolini and collaborators describe their in-house method validation of FAMEs analysis of olive oil triglycerides by GC-C/Py-IRMS for δ13C and δ2H values as a new tool for improving the capability to differentiate bulk EVOOs geographically. They generated repeatability data (n=10) for an EVOO and FAME standards (methyl linoleate, methyl oleate, methyl palmitate and methyl stearate) and reproducibility data for five olive oils, whilst using reference materials icosanoic acid methylesters USGS70 and USGS71 for normalising the raw data. They achieved suitable measurement uncertainty of ± 0.3 ‰ and ± 3 ‰ for the δ13C and δ2H, respectively. The authors also established that the contribution of the methylation group (established by EA-IRMS and SNIF-NMR) to the δ-values was negligible (Paolini et al., 2017).
3.1.5. Coffee
Coffee is highly vulnerable to food fraud, and beyond the traditional analytical methods, an effective and practical toolbox for coffee geographical origin determination is needed (Sim et al., 2023a).
For a novel approach, Sim and co-workers performed near infrared spectral imaging of green arabica coffee beans (6 samples per region) from 10 global regions in Africa (Ethiopia, Kenya) and South America (Colombia, Peru) to demonstrate that compositions of 5 stable isotope ratios (H, O, C, N, S) and 33 trace elements (Al, B, Ba, Ca, Cd, Ce, Co, Cr, Cs, Cu, Dy, Er, Eu, Fe, Gd, K, La, Li, Mg, Mn, Mo, Na, Nd, Ni, Pb, Pr, Rb, Se, Sm, Sn, Sr, Y, and Zn) could be predicted via modelling (PLS-Toolbox & SOLO 9.0- Eigenvector and R-Studio version 4.2.0). The R2 ranged from 0.69 to 0.93, with five elements (Mn, Mo, Rb, B, La) being moderately to well predicted, whereas three isotope ratios (δ13C, δ18O, δ2H) were well predicted by NIR. Three elements were found to be particular markers for country of origin (set of 60 samples), with chromium and lead only being found in African samples, and selenium only in South American samples (Sim et al., 2023b).
A recent article described a novel approach whereby stable isotope and trace element analyses were combined with non-linear machine-learning data analysis to improve coffee origin classification and marker selection (Sim et al., 2023a). Specialty green coffee beans (2021-2022) were sourced from three continents (Africa, Central America, South America), eight countries, and 22 regions alongside metadata (altitude, mean annual temperature, annual rainfall). By using ensemble decision trees, random forest and extreme gradient boost, accuracies for continental (0.94) and Central American regional (0.89) models were improved compared with partial least squares discriminant analysis (PLS-DA) on its own.
The oxygen isotope ratio of α-cellulose extracted from roasted coffee beans (n = 49 from 21 different countries) has been described as a useful indicator to determine region-of-origin, as it integrates source water and climate signals (Driscoll et al., 2020). Cellulose is unaffected by the roasting process and thus the δ18O values of cellulose can be used at point of sale as no measurable isotopic variability arises from roasting-related changes in the chemical composition of the coffee. The authors analysed the α-cellulose to calculate the δ18O value of coffee bean water based on the known relationship between the δ18O value of cellulose and that of the water at the site of synthesis. Hence, the 18O enrichment of coffee bean water was modelled as a function of local relative humidity, temperature and source water δ18O value. This function was incorporated into a mechanistic model of cellulose δ18O values to predict the δ18O values of coffee bean cellulose across coffee-producing regions globally (isoscape). The δ18O values of the α-cellulose ranged from 22 to 42 ‰ and the modelled values fell within ± 2.3 ‰ of the measured values. This excellent agreement of practice and theory will allow the authors’ model to be used to limit the scope of possible origins prior to the measurement of additional isotopic or trace element parameters.
3.1.6. Honey
Honey is a high-value product and assessing its authenticity, geographical origin and floral source is of high interest. Baroni et al. (2015) combined data for 33 trace elements (including K/Rb and Ca/Sr ratios) and carbon (bulk and protein) and strontium isotopes of Argentinean honeys (n= 79, harvested across 2007 to 2009 by bee keepers) for three different regions (Buenos Aires, Córdoba, and Entre Ríos) and demonstrated the link of honey to soil and water by applying different chemometric tools (generalised procrustes analysis – 91.5% and canonical correlations – r2 = 0.99) (Baroni et al., 2015).
Kalashnikova and Simonova (2022) studied hydrogen, oxygen and carbon stable isotope ratios of unadulterated honey samples (n=91) taken from Russian regions with different climatic characteristics. The values of isotope compositions varied from -29.5 to -24.2 ‰ for carbon, from -116.6 to -34. ‰ for hydrogen, and from 12.7 to 25.7‰ for oxygen. They found that the average δ2H and δ18O values in honey correlated with the average δ-values of atmospheric precipitation in the regions of the honey origins. The isotopic composition of carbon was also affected by climate; three zones of “isotopic landscape” for regions of Russia were identified: Siberian honey samples had the lowest δ13C, δ2H and δ18O values; honeys from the European part of Russia had intermediate δ-values, whilst honeys from the Black Sea region had the highest δ-values (Kalashnikova & Simonova, 2022).
During 2015 to 2017, 373 Turkish pine honey (honeydew honey) samples, produced under controlled conditions without sugar feeding, were collected from 47 sites across 3 different regions (mostly from the Aegean coast, with some from the Marmara and Mediterranean shore). δ13Cprotein, δ13Choney, C4% sugar, sugars (fructose, glucose, sucrose, and maltose) and a multitude of physicochemical properties - electrical conductivity, moisture, ash, free acidity, colour CIELAB attributes, optical rotation [α]20, proline and diastase activities - were determined alongside melissopalynological analyses [number of honeydew elements /number of total pollen (NHE/NTP) ratios (> 3 to be classed as honeydew honey)]. The results showed that all physicochemical parameters exhibited similar values to those of pine and honeydew honeys from other countries, except for C4% sugar (with a mean value of 6.40 ± 5.00 %). By applying PCA, the origin of the pine honey samples could be assigned to three different groups (South Anatolia, North Aegean and high-altitude locations) (Uçurum et al., 2023).
3.1.7. Cheese
A working example of using SIRA (and trace element analysis) to protect food against mislabelling relates to PDO cheese such as Parmigiano Reggiano and Grana Padano. Camin et al. (2015) reported an international collaborative study, based on blind duplicates of seven hard cheeses, which was performed according to the IUPAC protocol and ISO Standards 5725/2004 and 13528/2005. The H, C, N and S stable isotope ratios of defatted cheese, determined using Isotope Ratio Mass Spectrometry (IRMS), were measured alongside trace elements in 13 different laboratories. The average standard deviations of repeatability (sr) and reproducibility (sR) were 0.1 and 0.2 ‰ for δ13C values, 0.1 and 0.3 ‰ for δ15N values, 2 and 3 ‰ for δ2H values, and 0.4 and 0.6 ‰ for δ34S values, and were thus comparable with results of official methods and the literature for other food matrices.
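For readers unfamiliar with these precision measures, the sketch below shows how repeatability (sr) and reproducibility (sR) standard deviations are commonly estimated for a balanced design, following the general one-way analysis of variance approach of ISO 5725-2. It is an illustration only: the function name and the δ13C values are invented for the example, outlier screening is omitted, and the exact statistical treatment used in the collaborative study is not described here.

```python
from statistics import mean, variance


def precision_estimates(results_by_lab):
    """Estimate repeatability (s_r) and reproducibility (s_R) standard
    deviations from a balanced design in which every laboratory reports
    the same number of replicate results for one material."""
    n = len(results_by_lab[0])
    if n < 2 or any(len(r) != n for r in results_by_lab):
        raise ValueError("each laboratory needs the same number (>= 2) of replicates")

    # Within-laboratory (repeatability) variance: mean of the lab variances.
    s_r2 = mean([variance(r) for r in results_by_lab])

    # Between-laboratory variance from the spread of the lab means.
    lab_means = [mean(r) for r in results_by_lab]
    s_d2 = n * variance(lab_means)
    s_L2 = max((s_d2 - s_r2) / n, 0.0)  # negative estimates are set to zero

    # Reproducibility variance combines both components.
    s_R2 = s_r2 + s_L2
    return s_r2 ** 0.5, s_R2 ** 0.5


# Hypothetical blind-duplicate δ13C results (‰) from three laboratories.
duplicates = [[-24.1, -24.0], [-23.8, -23.9], [-24.3, -24.2]]
s_r, s_R = precision_estimates(duplicates)
print(round(s_r, 2), round(s_R, 2))
```

By construction sR can never be smaller than sr, which is consistent with the pairs of values reported for each isotope above.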
3.1.8. Government Research Projects using SIRA
In addition to the above research, UK government-funded projects based on SIRA have been undertaken to address issues in the determination of geographical origin, as discussed below.
3.1.8.1. Projects FA0205 and FS515009 British Beef Origin Projects – British beef Isotope Landscape Map (Isoscape)
Given the mandatory indication of origin for beef and beef products and also concerns following the 2013 horse meat crisis, research was funded to develop methods to independently verify the declared origin of meat on Country of Origin Labels.
An existing isotope landscape or food origin map (isoscape) had been created for British beef in the form of a geographical origin decision-making web tool, with previous funding from the FSA and Defra Seedcorn development funding (ref. Q01123, later transferred to Defra jurisdiction, ref. FA0205). Project FS515009[20], conducted by Fera Science, used the methodology established in FA0205 (n= 292 samples from 2007-2010) to improve the robustness of the decision-making tool, which was initially limited by a lack of authentic beef stable isotope data from East Anglia, the Midlands and the South East of England.
The results from this project were used to successfully augment the beef isotope database in 2014 with 200 samples taken from East Anglia, the Midlands and the South East of England in spring and autumn (FA0152). From analysis of blind test samples to determine their geographical origin, it was possible to identify the production region to within an average distance of 202 km (standard deviation = 94). To develop the database into a tool that could be used by enforcers to check geographical origin of a suspect sample, a statistical metric was developed using chi-squared values to determine if a sample was consistent with a specific production area. In this case the model was able to correctly determine that samples were consistent with their production origin in 95% of the cases.
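The idea of a chi-squared consistency metric can be illustrated with the sketch below. It is not the procedure implemented in the project, whose details are not given here: it assumes that each isotope ratio is independent and normally distributed within a production region, and the function names, regional means and standard deviations are invented for the example. The critical values are the standard 95 % quantiles of the chi-squared distribution.

```python
# 95 % quantiles of the chi-squared distribution for 1 to 4 degrees of freedom.
CHI2_95 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def chi_squared_statistic(sample, region_mean, region_sd):
    """Sum of squared standardised differences between a sample and the
    regional reference values, taken over all measured isotope ratios."""
    return sum(((sample[k] - region_mean[k]) / region_sd[k]) ** 2
               for k in sample)


def is_consistent(sample, region_mean, region_sd):
    """Return True when the sample cannot be distinguished from the
    claimed region at the 95 % level (assumes independent isotopes)."""
    statistic = chi_squared_statistic(sample, region_mean, region_sd)
    return statistic <= CHI2_95[len(sample)]


# Hypothetical regional reference values (‰) for H, C, N and S.
region_mean = {"d2H": -95.0, "d13C": -26.0, "d15N": 7.0, "d34S": 6.0}
region_sd = {"d2H": 5.0, "d13C": 1.0, "d15N": 1.0, "d34S": 1.5}

suspect = {"d2H": -92.0, "d13C": -25.5, "d15N": 7.4, "d34S": 5.0}
maize_fed = {"d2H": -92.0, "d13C": -14.0, "d15N": 7.4, "d34S": 5.0}
print(is_consistent(suspect, region_mean, region_sd))
print(is_consistent(maize_fed, region_mean, region_sd))
```

A result of "consistent" does not prove origin, because the same isotopic profile may also be compatible with other regions; the test is better suited to excluding a claimed origin than to confirming one.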
The database was further augmented in 2014-2015 with beef data collected from Scotland (n=250) and Northern Ireland (n=49), funded by Food Standards Scotland (FSS, Ref. FS515009)[20] and conducted by Fera Science. Those isotopes that were shown to have utility in the geographical discrimination of beef were hydrogen, carbon, nitrogen, and sulphur. The isotope ratios that were most likely to be affected by seasonality were carbon (relating to feed source, particularly if grass-based feed is enriched with maize or corn during winter months) and hydrogen (relating to climate and changes in temperature). The effects of seasonality were found to be minimal for all samples (from England, Scotland and Northern Ireland). The final database of 791 samples was challenged with 15 blind samples (retail samples randomly selected by FSS; country of origin was revealed to analysts post analysis). The procedure identified the 2-letter postcode regions where the samples could have originated from. The method correctly confirmed all samples were consistent with their actual postcode. It was noted, though, that the isotopic profiles of several samples were also consistent with other regions of the United Kingdom. Consequently, the procedure was updated to provide options for classification, using different confidence settings. The procedure was also tested with stable isotope profiles from a limited number of blind samples of non-UK origin, namely from Ireland, Germany, Brazil, Australia, and New Zealand. Using this procedure, Brazilian, Australian and New Zealand samples were clearly distinguished from Scotch beef. A German beef sample could be differentiated from certain parts of Scotland but not from all of them. An Irish beef sample could not be differentiated from Scotch beef. The pasture and/or fodder eaten by the cattle affects the carbon isotope signature of the meat.
Plants following a C3 photosynthetic pathway have a carbon isotope signature of around -27 to -24 ‰, whereas C4 plant material possesses a more positive carbon isotope signature of -16 to -9 ‰ (BBOP-FA0205). Brazilian, Australian and New Zealand samples come from cattle that are almost exclusively fed on C4 plants (maize), and therefore their carbon isotope signatures were different to those from Scottish cattle, which are mostly fed on C3 plants (grass).
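The contrast between C3 and C4 signatures lends itself to a simple two-source linear mixing calculation, sketched below. This is a generic illustration rather than the project's method: the function name is hypothetical, the default end-member values are simply the midpoints of the ranges quoted above, and the diet-to-tissue offset is assumed to be zero unless supplied.

```python
def c4_diet_fraction(d13c_sample, d13c_c3=-25.5, d13c_c4=-12.5,
                     tissue_offset=0.0):
    """Estimate the fraction of C4-derived carbon in the diet from the
    δ13C value (‰) of a sample using a two-end-member mixing model.

    The default end members are the midpoints of the ranges quoted in
    the text (assumption); tissue_offset is a hypothetical correction
    for any diet-to-tissue shift and defaults to zero."""
    fraction = (d13c_sample - tissue_offset - d13c_c3) / (d13c_c4 - d13c_c3)
    # Values outside the end-member range are clipped to 0 or 1.
    return min(max(fraction, 0.0), 1.0)


# A value midway between the two end members corresponds to a diet
# with equal carbon contributions from C3 and C4 plants.
print(c4_diet_fraction(-19.0))
```

Because the end members are ranges rather than single values, estimates from such a model carry considerable uncertainty and are best read as indicative.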
The stable isotope ratio analysis data set for Scotch beef was then updated and incorporated into a procedure (UK based web-tool), which was able to (i) confirm the origin of Scotch beef and (ii) detect potential food fraud in geographical mislabelling of Scotch beef, in the form of a screening method. Considering that the majority of global production of beef is from the Americas (45% of global production, 2013, FAOSTAT), this tool would offer protection from the mislabelling of beef as Scotch. Since the last data entry was in 2015, the database would need to be updated and challenged by testing more samples, as this would identify any possible variations due to changes in feeding regimes or potential impacts of climate change on the isotopic signatures. The webtool would also need to be updated to support these activities. This scenario exemplifies a current weakness in origin testing: investment is made to prepare prediction models for the duration of a project, but funding is not available to update, expand and maintain the database in a usable form.
3.1.8.2. Project FA0206 - Assessing the origin of wine using existing compositional information
The primary focus of the EU funded project entitled “Establishing a WINE Data Bank for analytical parameters for wines from Third Countries” (WINE-DB project, G6RD-CT-2001-00646-WINE-DB) was the discrimination of wine samples with respect to their geographical origin using only a few chemical parameters. In a section of the project funded by Defra, Fera Science investigated the possibility of discriminating the wines in the data bank according to their harvesting seasons and grape varieties. Several chemometric methods were selected and evaluated for this purpose. These were discriminant partial least squares, classification and regression trees, uninformative variable elimination discriminant partial least squares and neuro-fuzzy systems. With classification and regression trees, it was possible to identify a few chemical parameters including isotopic ratios (for example δ18O), biogenic amines and rare earth elements that discriminate between vintages and some grape varieties for wines produced in a particular country such as Czech Republic, Hungary, Romania or South Africa. These parameters were used in evaluating the authenticity of wines. It must be noted that this work provided proof-of-principle data only due to the small dataset and was not challenged with samples outside of the databank. This study has highlighted the complexity of defining the composition of wine by the grape variety used and by vintage. Whilst discrimination of different vintages and varieties is possible, it is clear that the variability in wine composition between vintages/varieties is not systematic, requiring different measurements to be made to define a fingerprint for each vintage and variety. This is most probably due to, amongst other factors, fluctuations in climate, variability in the manufacturing process and the sourcing of raw materials. 
Whilst clearly a consensus definition of a particular wine can be reached in terms of its chemical composition, this study has indicated that this is likely to be specific to the wine that is being analysed unless a wide range of measurements are made. Therefore, false labelling claims will only be identified by the continued maintenance of regulatory databases and/or the archiving of authentic materials, along with the current practice of expert interpretation of the data.
3.1.8.3. Project FA0159 - The development of isotopic and fingerprinting techniques to verify the production origin and geographical origin of food and feed (eggs, poultry and pork)
The aim of this project (2015-2019) was to build on the pilot studies undertaken (FA0130, Development of analytical methods to verify labelling claims relating to egg production) to identify markers, develop stable isotope ratio methods and build reference databases to verify age (and thereby freshness), production origin and country of origin labelling for eggs and poultry. The availability of these analytical tools would support enforcers in responding to anecdotal evidence of mislabelling of the origin of these products.
The geographical origin of feed used by poultry farms may vary, depending on cereal prices within the UK and elsewhere. Therefore, the stable isotope signatures for hydrogen, carbon, nitrogen and sulphur (HCNS) of a farm may fluctuate significantly with the change in geographical origin of the feed, which might provide a challenge for traditional stable isotope databases of solid food products (for example chicken). These are based on the data of the freeze-dried and defatted extract, that is the protein (HCNS isotope ratios), alone, using EA-IRMS. To overcome this, this project by Fera Science took the novel approach of using CO2-eq-IRMS, which is commonly applied to aqueous samples (for example wine, juice). We measured 18O/16O isotope ratios of liquid albumen, without any sample treatment, alongside the water on the egg farms. The δ18O data generated in this way allowed us to develop an egg/water model, which can detect substitution of eggs from one source with a secondary source based on a statistically robust minimum sample size, for example 12 chickens, assuming a sufficient δ18O difference between sources (for example 1.78 ‰). The advantage of this methodology is that a database of samples is not required to determine that substitution has occurred. This egg/water model can be applied to any suspect egg samples as long as drinking water from the egg production site can be sourced for confirmatory analysis. As a final stage to this work, the robustness of the model requires future investigation.
Fera Science also developed a statistical chicken (broiler) model, which uses 2H isotope ratios from farm water and HCNS isotope ratios from feed and chicken. Without the requirement for in-depth databases, substitution/product extension in UK chickens can be detected. This is possible when feeding regimes other than those used in the UK were followed (high maize content affecting the 13C signature), and/or chickens were reared in warmer climates (affecting the 2H signature). Our model looked at a range of substitution scenarios (from 5 to 50 %) and the sampling (6 to 300 chickens) required for detecting any substitution. It therefore allows the selection of a statistically robust minimum sample size for the detection of substituted chickens, for example 12 chickens if 30% substitution occurs for a given isotopic difference in the population of 13 ‰ (2H) or 1.01 ‰ (13C), and this would confirm that 2H and 13C isotope ratios are most suited for detecting fraudulent activities. We recommend conducting a small retail survey to put the robustness of the chicken model to the test. Access to feed and drinking water from the chicken farm, and their subsequent analysis, will validate the approach.
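The relationship between substitution level, isotopic difference and sample size can be illustrated with a generic power calculation. The sketch below is not the Fera Science model, whose details are not given here. It assumes a one-sided z-test on the batch mean, a known within-population standard deviation (the value used in the example is hypothetical) and a mean shift equal to the substituted fraction multiplied by the isotopic difference between sources; the function name and the default significance level and power are also assumptions.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(substituted_fraction, isotopic_difference,
                         population_sd, alpha=0.05, power=0.80):
    """Minimum number of animals to sample so that a one-sided z-test on
    the batch mean detects the shift caused by partial substitution.

    Assumptions (not from the source): normally distributed values, a
    known population_sd, and a mean shift of fraction x difference."""
    shift = substituted_fraction * isotopic_difference
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_beta) * population_sd / shift) ** 2)


# Example: 30 % substitution with a δ2H difference of 13 ‰ between
# sources and a hypothetical within-population SD of 5 ‰.
print(required_sample_size(0.30, 13.0, 5.0))
```

Under these assumptions, the required sample size grows rapidly as the substituted fraction or the isotopic difference shrinks, which is why isotopes with large differences between sources are the most useful for screening.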
An outcome of the project was the clear need for governmental bodies to provide funding to generate and curate relevant isotopic databases, ideally as open access. The EU wine databank is a prime example of an EU membership database, where national reference laboratories receive governmental funding to contribute data. One cannot see the data gathered by other countries but can request verification of measured data by the member states, which allows detection of fraudulent wines. Therefore, government support may be needed to make progress on database sharing issues given the significant challenges that remain. The project highlighted that AHDB holds the largest and most extensive database relating to pork origin, which can only be accessed via the British Meat Processors Association pork scheme. Database surveys conducted during this project highlighted that, besides this intellectual property issue of commercial databases, the lack of funding and the comparability of the data (standard reference materials and authenticity of samples) present hurdles for creation and sharing of data at the national, European and global levels.
3.1.8.4. Project FA0179 Review of analytical methods, Horizon Scanning and Capabilities for Food Authenticity (Food composition, labelling and Standards)
This project (2018-2022), funded by Defra and conducted by Fera Science, reviewed (i) the current landscape of tools that can be used to verify food authenticity, (ii) what approaches are used in practice within the food industry and (iii) an associated gap analysis where tools are not implemented. In addition, a review of technologies that are implemented in orthogonal sectors was undertaken to identify where already deployed analytical technologies may be sought, if required, in case of an emerging issue. A specific review into UK based commodities with protected food name status was also undertaken to determine whether the current EU system provides sufficient protection of these high value commodities and what aspects could be further developed as the UK sets up its own protected food name scheme after leaving the EU.
The review identified areas for future consideration regarding ‘product specifications with geographical indication’ for the UK. In the current format the product specification describes how a product is linked to a given geography (geographical indication). Regulation (EU) No 1151/2012 defines what information should be in the product specification, including a description of the product in question and how to maintain the record of a particular specification. One new aspect could be the inclusion of analytical measurements. This would enable screening of products in a laboratory environment with subsequent follow-up by inspectors where specifications were not met. It was noted that expert knowledge is required to identify appropriate approaches to determine origin depending on the PDO/PGI product. Therefore, it is not possible to generally recommend a single approach; instead approaches should be considered on a case-by-case basis. Some technologies are more suitable for geographical origin verification, for example stable isotope ratios, but one needs to acknowledge that isotopic signatures, representing geographical origin, are influenced by various contributing factors, for example animals having been fed with feed of a different geographical origin from the rearing location. Given the complexities and expert knowledge required around different commodities, it was suggested to have continued discussions with Defra and the UK GI scheme teams to ensure all aspects are adequately controlled, such as (i) achievable verification means, (ii) practicalities of auditing products (both within and external to the UK) and (iii) the desire to promote products outside of the UK for increased exports.
As part of this project, a number of UK product specifications under the EU protected food name schemes (PDO, PGI, TSG) were examined. It was noted that these specifications only included a product description, but not details on possible markers of authenticity or testing methods to verify a product. Delegated control bodies will have access to authentic samples (produce controls) to verify that a product meets the requirements of its specification. These products command premium prices and therefore present an increased incentive for adulteration. The verification of such products is a complex task and requires knowledge of the aspects that distinguish them from conventional, non-premium equivalents. As the UK has now established its own protected food name scheme, consideration should be given to including recommended analytical tests for product authentication. Given the complexity of the challenge, this scheme may also benefit from external stakeholder engagement and involvement of industry and research providers in this field.
Stakeholder engagement during the project demonstrated that geographical origin is already an important characteristic for consumers when considering food labelling. One of the conclusions of the report was that a dedicated department or even independent body could drive best practices as a central focal point for UK protected designation food produce.
3.1.8.5. EU Framework 6 Project TRACE: Tracing (the Origin of) Food Commodities in Europe (2005 – 2009)
Defra participated in the European Research Project “TRACE”, coordinated by Central Science Laboratory (CSL, now Fera Science Ltd), with 52 participating countries. During the TRACE project, working group 1 (consisting of experts in stable isotope, trace element and strontium ratio analysis) performed analysis of light stable isotopes (H, O, C, N, S), strontium ratios and trace element profiling to monitor the uptake of these profiles from the soil, water, feed (cereal/wheat), olive oil and meat (lamb/beef). The sampling (n = 20 per region and year) was conducted in 11 European regions in two consecutive years.
The work at CSL was supplemented by similar analyses at a further 15 scientific institutes across Europe to provide analysis of 12,200 food, soil and water samples (over 647,000 element concentration and isotopic data points generated). The data were used to exploit current annual geological and groundwater maps to generate a European Food Origin Map. An extensive multi-element SIRA database of European honeys was also established. There is a large database of unreleased data generated by SIRA and TE for beef, lamb, chicken, wheat, honey and others. Most of this has been published as summaries and predates the review window for this report.
Concerning lamb, the results permitted differentiation of lamb meat from most production regions, achieving a correct classification rate of 78% (Camin et al., 2007).
The geological origin of the olive oils was characterised based on the content of fourteen elements (Mg, K, Ca, V, Mn, Zn, Rb, Sr, Cs, La, Ce, Sm, Eu, U). By combining the three isotopic ratios (H, C, and O) with the fourteen elements and applying a multivariate discriminant analysis, a good discrimination between olive oils from 8 European sites was achieved, with 95% of the samples correctly classified into the production site (Camin et al., 2010).
Regarding cereals, more than 500 cereal samples collected over 2 years from 17 sampling sites across Europe and representing an extensive range of geographical and environmental characteristics were analysed. For the first time, the potential usefulness of combining Sr, C, N and O isotopic signatures, alone or with key element concentrations (Na, K, Ca, Cu and Rb, progressively identified out of 31 sets of results), was investigated through multiple step multivariate statistics.
From the classification categories compared for cereals (north/south; proximity to the Atlantic Ocean, the Mediterranean Sea or elsewhere; bedrock geologies), the first two were the most efficient (particularly with the ten variables selected together). In some instances, element concentrations made a greater impact than the isotopic tracers. Validation of models included external prediction tests on 20% of the data, randomly selected. This allowed the robustness of these multivariate data treatments to be studied and the measurement uncertainties of the results to be determined. With the models tested it was possible to individualise 15 of the 17 sampling sites and therefore this approach demonstrated much potential (Goitom Asfaha et al., 2011).
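The external prediction test described above can be illustrated with a minimal sketch. It assumes a hypothetical nearest-centroid classifier and synthetic two-variable data for two invented regions; neither is taken from the TRACE study. Only the idea of setting aside a random 20% of the samples for prediction comes from the text.

```python
import random
from statistics import mean

def split_holdout(samples, test_fraction=0.2, seed=0):
    """Randomly set aside a fraction of the samples as an external test set."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (training, test)

def fit_centroids(training):
    """Mean feature vector per class (hypothetical stand-in for the real model)."""
    by_class = {}
    for features, label in training:
        by_class.setdefault(label, []).append(features)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in by_class.items()}

def predict(centroids, features):
    """Assign the class whose centroid is nearest (squared Euclidean distance)."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: dist2(centroids[label]))

def holdout_accuracy(samples, test_fraction=0.2, seed=0):
    """Fit on the training part only, then score on the held-out part."""
    training, test = split_holdout(samples, test_fraction, seed)
    centroids = fit_centroids(training)
    correct = sum(predict(centroids, f) == label for f, label in test)
    return correct / len(test)

# Synthetic, well-separated data for two invented regions (not real measurements)
rng = random.Random(1)
data = [([rng.gauss(0, 1), rng.gauss(0, 1)], "north") for _ in range(50)]
data += [([rng.gauss(10, 1), rng.gauss(10, 1)], "south") for _ in range(50)]

print(holdout_accuracy(data))
```

Because the held-out samples play no part in fitting, the resulting accuracy is a fairer indication of how a model might perform on new samples than the fit to the training data.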
It was investigated whether honeys produced in regions with different climatic and geological characteristics could be discriminated on the basis of the isotopic data. The H, C, N and S stable isotope ratios of 516 authentic honeys from 20 European regions were analysed. The mean hydrogen isotopic ratios of the honey protein were found to be significantly correlated with the mean hydrogen isotopic ratios of precipitation and groundwater in the production regions. Carbon isotopic ratios were influenced by climate. The sulphur stable isotope composition was clearly influenced by geographical location due to sea spray and surface geology of the production regions. The results show that the ratios of these stable isotopes can be applied to verify the origin of honey. Carbon and sulphur were identified as providing the maximum discrimination between honey samples (canonical discriminant analysis). For seven regions (Allgaeu-Germany, Lakonia-Greece, Sicily-Italy, Cornwall-England, Algarve-Portugal, Orkneys-Scotland and Poland), the percentage of correctly classified samples was greater than 70%. It was concluded that the methodology in its current state can be used to provide reliable origin information. The authors highlighted that the discriminatory success of the method may be further improved in the future by combining the data with those from trace element analysis, polymerase chain reaction (PCR) analysis and pollen analysis (Schellenberg et al., 2010).
3.1.8.6. Technology Strategy Board Project: Authentication Scheme for novel and Premium British Food and Drink (Authentick)
In 2011 the Technology Strategy Board (TSB) funded the 'Authentication Scheme for novel and Premium British Food and Drink' (Authentick) under the Nutrition for Life programme.
The very ambitious aim was to develop isoscape models for Scotch whisky, British meat (beef, pork, lamb and chicken), British eggs and British honey, based on stable isotope data (H, O, C, N, and S). Data sets for statistical analyses (isoscape) were supplemented with results for samples from parallel funded Defra projects (FA0152 and FA0129). Sufficient data (n = 100) were only available for meat and egg commodities to allow the isoscapes to be created and tested by closed-loop validation. The following success rates were achieved in correctly assigning samples at postcode/government office resolution: beef 25%, lamb 5%, pork 6% and egg 30%. The error rate in geographical origin assignment at both postcode and government office resolution was unacceptable and is likely to be attributable to factors such as poor sample distribution (not enough sampling of production areas) and/or an insufficient number of samples included in the isoscape. The data are now held in a private database by a private company.
3.1.8.7. EU Framework 7 Project FOODINTEGRITY: Ensuring the integrity of the European food chain (2014 – 2018)
FoodIntegrity was a 5-year, interdisciplinary project funded under Framework 7 and led by Fera Science, that aimed to assure the integrity of our food. The project comprised 60 participants from EU Member States, China and Argentina. The main impact of FoodIntegrity was that all stakeholders are now better informed, share best practice, have better networks, better tools, improved methods and systems for addressing food fraud and for improving consumer confidence in the food they eat. A large database was generated which encompasses numerous methods for geographical origin testing. This database is held by the European Commission Joint Research Centre (JRC) and is no longer online so was unavailable to consult for the duration of this project. The individual methods were instead captured by the extensive literature review, incorporating the range of technologies discussed in this report.
3.1.9. Limitations and gaps in SIRA analysis
As for all analytical techniques, SIRA does not always provide unequivocal information about the geographical source of a food sample. As concluded in project FS515009[21], the statistical comparison of a profile to a database of samples with a known origin can provide a very high confidence that a sample has been mislabelled (for example >95%); in other cases mislabelling may be indicated, but with lower confidence (for example 80%). Hence, users of this approach (and similar approaches) need to make their own judgement, often backed up with other forms of evidence such as chain of custody evidence, as to whether the data are inconsistent with the declared geographical origin (indicating incorrectly declared origin) and about the appropriate response to results which show (with a particular confidence) that a sample may have been mislabelled.
Generating and curating stable isotope databases requires a significant investment, hence the majority of such databases are not open access. For this reason, published articles report only the mean or median values for a collection of samples and not individual data points. This means that statistical evaluation of the data by others cannot be easily performed. An exception is the precipitation database of the University of Utah, which enables access to isoscapes for 18O and 2H. SIRA employs reference materials (RMs) to determine stable isotope ratios of a given sample. In an ideal scenario, these should (a) match the matrix of the sample in question and (b) be available in pairs of the same matrix to stretch the entire δ-scale of the samples to be analysed. The first steps have been made through a combined effort of multiple institutes, resulting in reference materials issued by the United States Geological Survey (reference numbers USGS82 to USGS91), which relate to (i) two honeys from Canada and tropical Vietnam, (ii) two flours from C3 (rice) and C4 (millet) plants, (iii) four vegetable oils from C3 (olive, peanut) and C4 (corn) plants, and (iv) two collagen powders from marine fish and terrestrial mammal origins (Schimmelmann et al., 2020). Zhao and co-workers produced two defatted beef reference materials, CAAS-1901 and CAAS-1802 (Zhao et al., 2019), with substantially different δ13C values (due to differences in dietary intake), which were measured and confirmed by nine international laboratories. However, the difference between their δ15N values fell within the analytical error of the method (0.2 ‰) and therefore these materials on their own are not suitable for nitrogen data scale correction. Three laboratories (Chartrand et al., 2022) assigned δ13C values for two vanillin certified reference materials, VANA-1 (-31.3 ± 0.06 ‰) and VANB-1 (-25.85 ± 0.05 ‰), and recommended establishing a third vanillin reference material close to -15 ‰ to cover the entire carbon delta scale.
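The role of paired reference materials in stretching the δ-scale can be illustrated by a two-point linear normalisation. The sketch below is illustrative only: the assigned values are those quoted above for VANA-1 and VANB-1, whereas the raw (measured) values, the sample value and the function name are hypothetical.

```python
def two_point_normalise(raw_sample, raw_rm1, raw_rm2, assigned_rm1, assigned_rm2):
    """Map a raw delta value onto the reference scale using two reference materials."""
    # Slope corrects for scale compression or expansion between the two anchors
    slope = (assigned_rm2 - assigned_rm1) / (raw_rm2 - raw_rm1)
    return assigned_rm1 + slope * (raw_sample - raw_rm1)

# Assigned d13C values (per mil) quoted in the text
VANA_1 = -31.3
VANB_1 = -25.85

# Hypothetical raw instrument readings for the two RMs and one sample
raw_vana = -30.90
raw_vanb = -25.60
raw_sample = -28.00

corrected = two_point_normalise(raw_sample, raw_vana, raw_vanb, VANA_1, VANB_1)
print(round(corrected, 2))
```

A sample whose raw value lies outside the interval spanned by the two reference materials is corrected by extrapolation rather than interpolation, which is one reason a third material near -15 ‰ would be useful.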
It is important that robust databases are created, capturing natural variability, for example by including samples gathered across seasons and years and including pure and processed foods.
It is clear from the literature review that there is a lack of examples of food industry funding research in country of origin verification. Should industry become more involved, sampling would capture the natural variability of pure and processed foods. Also, there are few cited examples of implementing databases into working systems to monitor supply chains. There would be great benefit in this form of implementation, especially at the points of greatest vulnerability. Ideally, an infrastructure would be in place for vulnerable food commodities, with databases being prepared and maintained, in order to quickly test a given commodity to identify issues and address these issues before they become a problem.
There are areas where databases have been developed and these can easily be translated to similar products. As a hypothetical example, a quick δ13C test could detect South American beef in UK corned beef. Generally, South American beef cattle are fed on maize (a C4 plant with a δ13C range of -14 to -12 ‰) and UK beef cattle are fed almost entirely on grass (a C3 plant with a δ13C range of -30 to -23 ‰) with very little supplementation of maize. This should be reflected in the isotopic signature of the meat. The proportion of C3 versus C4 photosynthetic plant material in the cattle’s diet determines whether beef has an isotope signature of around -27 to -24 ‰ (cattle fed almost exclusively on C3 plants) or of around -16 to -9 ‰ (cattle fed almost exclusively on C4 plants) (BBOP-FA0205). Hence, there should be no need to generate a specific database for corned beef to answer the question of whether UK beef in corned beef has in fact been substituted with South American beef. One could instead analyse a suspect corned beef sample and compare it directly to the Fera UK beef database, developed during FSA project FS515009.
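The reasoning in this example amounts to a two-end-member mixing calculation. A minimal sketch follows, assuming hypothetical end-member values taken as the midpoints of the beef ranges quoted above (-25.5 ‰ for a C3 diet and -12.5 ‰ for a C4 diet). A real assessment would compare the sample against the distribution of values in the database rather than rely on fixed end-members.

```python
def c4_fraction(delta_sample, delta_c3=-25.5, delta_c4=-12.5):
    """Estimate the C4-derived fraction of dietary carbon from a beef d13C value.

    The default end-members are assumptions (midpoints of the quoted ranges).
    """
    fraction = (delta_sample - delta_c3) / (delta_c4 - delta_c3)
    # Clamp to the physically meaningful range 0..1
    return min(1.0, max(0.0, fraction))

# Hypothetical sample values (per mil)
for delta in (-25.5, -19.0, -12.5):
    print(delta, c4_fraction(delta))
```

A value close to 0 is consistent with grass-fed (C3) beef, whereas a value close to 1 points towards a maize-based (C4) diet.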
An additional limitation is the lack of proficiency testing schemes for geographical origin verification. To our knowledge there are several proficiency testing schemes (for example organised by Eurofins since 1999, International Atomic Energy Agency (IAEA); FIRMS network facilitated by Axio/LGC) in place which allow stable isotope laboratories to demonstrate the accuracy and precision of measurements or ability to detect adulteration in honey (determining the C4 sugar addition, when comparing the carbon isotope ratios of honey with honey protein – AOAC 998.12 method), but none of these focus on geographical origin. Indeed, to our knowledge and from accessing the global proficiency testing providers database EPTIS, there are no proficiency testing schemes for any technology which address geographical origin. Schemes tend to be undertaken when there is sufficient participant interest when approached by PT providers. This lack of schemes may therefore highlight a very large gap in capability and/or engagement.
3.1.10. Conclusions for SIRA analysis
Overall, SIRA is still the method of choice for verification of geographical origin of food and feed, despite being costly to run compared to handheld spectroscopy devices, provided that established databases built on stable isotope data (for example EU wine databank, AHDB pork, Italian cheeses and intellectual property (IP) from commercial laboratories) fulfil their purpose. Ensuring that databases capture natural variation is key, and the data should be updated regularly so that they continue to do so. Once databases are established, investment must be made to implement them into working systems to monitor supply chains, especially at points identified as being of greatest vulnerability. In most cases SIRA is combined with trace elements and/or strontium ratio analysis, followed by application of chemometrics. In future, the collaboration of analysts and statisticians will be paramount for selecting the most appropriate approach for fusing data from different analytical techniques and origins (sample or environmental factors). On the one hand, stable isotope analysts might not always possess an in-depth knowledge of statistical modelling or be aware of the required size of data sets to perform meaningful statistics and, on the other hand, statisticians might lack the understanding of the analytical data and their significance to verify country of origin.
Regarding the maintenance of databases and updating the samples on a regular basis, having a break in the collection of samples for a particular commodity does not necessarily invalidate using historical data for origin determinations. Instead, ‘old data’ should be considered and may be tested against contemporary data to examine temporal trends, or indeed to establish if data have remained stable in the intervening period. Such multivariate tests can be completed relatively easily to classify old data against new data and vice versa, assuming the methods of analysis (for example sample preparation and fractions analysed) have remained the same. However, this would incur a delay should a method be required to address an issue which arises in the supply chain. Therefore, there are benefits in maintaining databases on a regular basis.
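As a simplified illustration of testing historical against contemporary data, the sketch below applies a permutation test to a single variable. This is a univariate stand-in chosen for brevity, not the multivariate testing referred to above, and all values are invented.

```python
import random
from statistics import mean

def permutation_p_value(old, new, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means between two data sets."""
    rng = random.Random(seed)
    observed = abs(mean(old) - mean(new))
    pooled = list(old) + list(new)
    hits = 0
    for _ in range(n_permutations):
        # Reassign the pooled values to the two groups at random
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(old)]) - mean(pooled[len(old):]))
        if diff >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero
    return (hits + 1) / (n_permutations + 1)

# Invented d13C values (per mil) for a historical and a contemporary collection
historical = [-25.0, -25.2, -24.9, -25.1, -25.3]
contemporary = [-20.0, -20.2, -19.9, -20.1, -20.3]

print(permutation_p_value(historical, contemporary))
```

A small p-value would suggest the two collections differ, so the historical data may not represent current production; a large p-value gives no evidence of a shift.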
3.1.11. Outlook for SIRA analysis
We envisage that in the first instance, non-destructive screening techniques (for example NIR spectroscopy, multispectral imaging) alongside paper trail or artificial intelligence (AI) will be used to verify geographical origin, followed by SIRA in conjunction with trace elements and/or strontium ratios including chemometrics to confirm results if necessary. This scenario allows the confirmatory methods which are more costly to run, only to be used when absolutely needed; whilst continuing to deter people from mislabelling activities through the latest screening methods.
3.2. Trace element analysis
Profiling of foodstuffs using inductively coupled plasma (ICP) – mass spectrometry (MS), ICP – optical emission spectrometry (OES) and microwave plasma (MP) – atomic emission spectrometry (AES) to determine trace element concentrations has been used to determine or support the verification of the geographical origin of a range of foods and beverages. The trace elements present within the soil depend on a number of factors including the type of rock the soil originated from, the pH, moisture content, clay content, topographic features of an area, climate, time and human activity. Elements such as rubidium, strontium and calcium which are associated with geology (rock formation, bottom of soil layers) provide valuable sources of data which are not influenced by soil geochemistry. Other elements are more linked to soil chemistry, occurring in the top layer of soil.
The elemental profile of plants is linked to that of the soil in which the plants are grown, which in turn impacts the elemental profile of the grazing animals. In addition to the geographical origin, the trace element composition can also be influenced by the genotype, environment and their interactions. As these factors impact the concentrations of the trace elements, it is not always possible to use these data in isolation to confirm the origin of a food or beverage. Examples of commodities for which working databases include trace elements are cotton and PDO cheeses (Grana Padano and Parmigiano Reggiano).
Katerinopoulou and co-workers published a “systematic literature review of geographical origin authentication by elemental analytical techniques”, focused on papers published between 2015 and 2019. Of the 155 papers selected as part of this review, 28 described the contribution of ICP-MS as a tool to support the identification of the geographic origin. In many cases the trace element data were combined with other techniques, in particular IRMS, to demonstrate the origin. The review referenced examples of origin determination for potatoes, lettuce, peppers, tomato, onions, rice, flour, cereals, oranges, herbs, olive oil, scallops, eel, sea bass, clams, honey, lamb, milk and cheese using trace element and isotope ratio data (Katerinopoulou et al., 2020).
Applying the search criteria previously described, 47 papers were selected for review. Additional information was derived from grey literature such as instrument manufacturers' application notes and other papers of interest as referenced in the selected manuscripts. The foods and beverages in which trace element analysis has been applied for origin identification in these publications are discussed.
3.2.1. Cereals
Ten varieties of wheat were cultivated in three regions of China (Hebei Province, Henan Province and Shaanxi Province). The samples (270 samples, 90 from each region harvested across three years) were randomly selected and were analysed for thirteen isotopes: 24Mg, 27Al, 44Ca, 55Mn, 56Fe, 63Cu, 66Zn, 75As, 88Sr, 95Mo, 111Cd, 137Ba and 208Pb by high resolution ICP-MS. Differences in element concentrations were observed for all except Cu and Pb. Multiway analysis of variance demonstrated that geographic origin was the most important source of variation for Mn, Sr, Mo and Cd, genotype for Ba and the harvest year for the remainder of the elements tested. The concentrations of the elements in the soil correlated with those in the wheat for Mg, Ca, Mn, Sr and Cd. Conversely, the concentrations of As and Mo in the soil and wheat were negatively correlated. There was no correlation for the other elements measured. The authors (H. Y. Liu et al., 2017) concluded that four elements (Mn, Sr, Mo and Cd) were “closely related to geographic origin” and that these elements could be used to “reliably and effectively trace wheat geographical origin”.
Zhang et al. (2021) analysed eight elements in Tibetan highland barley and soil from five regions across Tibet. Tibetan highland barley is claimed to be rich in nutrients and to have unique health care functions (G. Yu et al., 2016), leading to research to establish whether origin can be determined. As well as trace element analysis, the authors determined crude starch, crude protein and crude fibre. 126 highland barley and associated soil samples were collected from 5 Tibetan cities and analysed by ICP-OES. Multivariate statistical analysis was carried out (principal component analysis, partial least squares discriminant analysis and orthogonal partial least squares discriminant analysis) which the authors state can “clearly classify samples from different regions”. There was a positive correlation between the zinc and iron content in the barley and the soil and a negative correlation for potassium, manganese and phosphorus (T. W. Zhang et al., 2021).
3.2.2. Cocoa
Almost 10 years ago, the International Cocoa Organization (2014) reported that cocoa beans are a top ranked commodity with their price linked to geographical origin and therefore, the ability to identify the geographic origin of cocoa beans is required to prevent fraudulent activity.
Bertoldi et al. (2016) demonstrated the application of multi element analysis to determine the geographical origin of cacao beans and cocoa products. They analysed 61 cacao bean samples from 23 countries across Africa, Asia and Central and South America using ICP-MS. 29 elements (Ag, As, Ba, Be, Bi, Ca, Cd, Co, Cr, Cs, Cu, Fe, Ga, Hg, K, Li, Mg, Mn, Na, Ni, P, Rb, Se, Sr, Th, Tl, U, Y and Zn) were included in the model derived. Where sample numbers were high enough the statistics applied reclassified all of the cocoa beans to the macro-area of origin, i.e. Africa, Asia and Central and South America (Bertoldi et al., 2016).
Chocolate, made from seeds of the cacao tree, can be contaminated with cadmium and lead by polluted soil. Lead can also contaminate the cocoa beans after harvest, potentially from dust and soil during drying of the cacao beans. There have recently been reports of high levels of these elements in dark chocolate (which contains higher levels of cacao solids). Work has separately been carried out to investigate the link between the metal concentrations in chocolate and the country of origin and, in so doing, to determine the source of the highest levels of lead and cadmium in cacao beans. Following the analysis of 139 single origin chocolates for Ca, K, Mg, Na, P, S and the trace elements Al, As, B, Ba, Be, Cd, Co, Cr, Cs, Cu, Fe, Ga, In, Li, Mn, Mo, Ni, Sb, Se, Sn, Sr, Pb, Ti, Tl, U, V, Zn and Z, researchers (Vanderschueren et al., 2019) developed a decision tree that allowed the differentiation of chocolate from Africa, Asia Pacific, Central America and South America following Classification and Regression Tree (CART) analysis. The concentrations of all elements in the chocolate, bar Cr and V, correlated with the cacao content, indicating the cacao as the source of the elements. 16 of the samples tested exceeded the European limit for cadmium for chocolate with 50% or higher cacao content. It was primarily the high Cd content in the South American samples that differentiated their origin, with the addition of Mo content allowing separation from Central American samples. Cd concentrations in African and Asia Pacific samples were lower. Ba, Sr and Zn content could be used to further differentiate the samples. It was not possible to classify the Asia Pacific samples, which the authors attributed to variable soil composition and climate in the region (which included samples from India, Indonesia, Papua New Guinea, Samoa, Vanuatu and Vietnam). The overall misclassification rate (all origins) was 23% based on the concentrations of Cd, Mo, Ba, Sr and Zn (Vanderschueren et al., 2019).
3.2.3. Coffee beans
There have been numerous studies to determine the geographical origin of coffee beans mainly applying isotopic methods to determine the ratios of 11B/10B, 13C/12C, 15N/14N, 18O/16O, 34S/32S, and 87Sr/86Sr [(C. Rodrigues, Brunner, et al., 2011), (C. Rodrigues, Máguas, et al., 2011), (C. I. Rodrigues et al., 2009), (Serra et al., 2005), (Wieser et al., 2001)].
Work by Rodrigues and collaborators showed that for coffee bean samples from the Hawaiian Islands, the combination of S, O, C, N, and Sr isotope analyses with multielement analysis allowed differentiation of the different Hawaiian coffee-producing regions. The results indicated relationships between environmental variables and the green coffee bean isotopic composition. The authors state that additional work is needed to clarify the mechanisms underlying many of these relationships; however, the results suggested that the isotopic composition of coffees from different regions may, to some degree, be predictable. If so, this would support the use of stable isotopes as a tool for the verification of coffee origin. In addition, the coffee plant seeds’ isotopes may contribute to tracing environmental impacts occurring in Hawaii, in particular if related to volcanic activity, distance to the ocean, and altitude (C. Rodrigues, Brunner, et al., 2011). However, others have reported that the isotope ratio data for Sr alone did not differentiate the samples sufficiently to determine the country of origin (H. C. Liu et al., 2014). Instead, they analysed 21 Arabica beans for B, Rb, Sr, Ba, Fe, Mn and Zn and the 11B/10B and 87Sr/86Sr isotopes. Samples, with assured origin, were obtained from Africa, America and Asia. The authors reported that the concentrations of Rb, Sr and Ba could be used to classify the origin, but it was the isotope ratio data for Sr and B that provided the more sensitive information linked to the origin.
3.2.4. Fish and seafood
Fish and shellfish represent the third most notified commodity for misdescription of origin. There has been a lot of research carried out on the authenticity of fish following reports of species substitution; however, with geographical location attracting higher market prices, methodology for geographical origin determination is needed in addition to the DNA testing used for fish species identification. Elemental concentrations vary greatly in the marine environment and are controlled by a combination of natural and anthropogenic activity (Rainbow, 2017).
Han et al. (2021) determined the concentrations of 14 elements (Al, Ca, Co, Cr, Cu, Fe, Ga, K, Mg, Mn, Na, Ni, Sr, and Zn) in salmonids with the aim of determining the geographical origin of these species. Factors affecting the concentrations in the samples (n = 96) included harvest time (Fe, K, Na, Mn and Zn), fish size (Al, Fe, Ga, Mn and Zn) and whether the samples were freshwater-cultured or seawater-cultured (Ga, K, Mg, Na, Ni and Sr) as well as the geographical location (Al, Ca, Co, Cr, Fe, Ga, K, Mg, Mn, Na, Ni, Sr, and Zn). Samples (freshwater and seawater) were collected from various regions in China and Chile. Analysis was carried out by ICP-AES. It has previously been reported that the elemental composition of fish is impacted by climatic variation (Mendil et al., 2010; Fallah et al., 2011). In this study the concentrations of Mn, Na and Zn were influenced by harvest time and climate; however, Fe and K were also impacted when fish were harvested at different times but cultured in a controlled system. Thus, the differences observed could not be fully attributed to differences in the climatic conditions. Models developed following LDA of the data from 12 elements (Ca, Co, Cr, Fe, Ga, K, Mg, Mn, Na, Ni, Sr, and Zn) were able to predict geographic origin with an accuracy of >90%. The authors acknowledge that the study “had some limitations” based on limited sampling points, individual samples, and influencing factors and proposed that additional work is required to refine the models to increase the success rate for the identification of origin (Han et al., 2021).
More recently, the use of machine learning techniques has been applied to the determination of origin. Using data from elemental analysis Bai and co-workers were able to determine the origin of crayfish discriminating between three sites in China. 10 elements (V, Fe, Al, Ga, Co, Zn, Cs, Rb, Ba, and Sr) were found to be characteristic elements of red swamp crayfish in different regions of China (S. Bai et al., 2022).
REIMS and ICP-MS have been applied to classify salmon production methods and origin (Hong et al., 2023). Samples (n = 522) produced by two production methods (farmed and wild caught) were collected from four regions (Alaska, Norway, Iceland and Scotland). Using trace element data coupled with lipid marker data and applying chemometric modelling (PCA) with machine learning, all test samples (n = 17) were correctly classified for both production method and origin. 20 elements were used to develop the model (Li, B, Al, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, As, Se, Rb, Sr, Nb, Mo, Cd, Cs, and Ta). Without the fusion of data (with the lipid marker data) the accuracy of the origin determination was lower (66%). Within this article the authors also summarised the findings of other research groups using ICP-MS to determine origin, including the combination of ICP-MS and NIR spectroscopy that was able to differentiate between Chilean-farmed and Norway-origin salmon (Fu et al., 2021).
A similar approach has also been applied to scallops (Morrison et al., 2019) but in this example trace element data alone was able to discriminate between the harvesting sites (3 locations on the west coast of Ireland). Ba, B, Cr, Ob, Mn, Mo and Se were the elements included in the model. The authors highlighted the importance of “periodical verification of reference chemical signatures” and that any “reference library must therefore be continuously updated to enable successful classification over time”.
3.2.5. Fruit juice
The concentrations of 25 elements (K, Na, P, Ca, Mg, Al, B, Ba, Be, Co, Cr, Cu, Fe, Li, Mn, Mo, Ni, Sb, Se, Si, Sn, Ti, Tl, V, Zn) in 36 prickly pear juice samples and the soil in which they were grown (3 regions in Greece) were determined by ICP-OES (Karabagias, 2019). There was correlation between the trace element concentrations in the soil and the fruit and the geographical origin could be predicted once the trace element data (seven elements were used in the model) was combined with the data derived from the analysis of the volatile substances (21 tentatively identified volatile substances were included in the model). The resulting model classified the prickly pear juice to the correct geographical origin with a success rate of >85%.
Multivariate analysis of trace element concentrations measured in 482 Australian and Brazilian orange juice samples showed a clear differentiation between them (Simpkins et al., 2000). Both juice and orange peel samples were analysed. Regional differences were observed in Australia, linked to differences in element concentrations in the soil, as well as between Australian juices and Brazilian juice concentrate samples. Ru, Ba and B contributed the main differences between Australian and Brazilian samples. Peel samples could similarly be differentiated between the two countries.
3.2.6. Garlic
Analytical screening techniques have been used to determine the composition of Chinese garlic. 34 trace elements, 68 volatile compounds and 854 metabolites were detected using a suite of analytical techniques and used to develop a chemometric model (Mi et al., 2021). A large number of parameters were required to differentiate the Chinese garlic and it was not possible, in this study, to achieve this using the trace element data alone. The authors also reviewed other studies [(Camargo et al., 2010), (Ahn et al., 2019), (D’Archivio et al., 2019), (T. S. Liu et al., 2018), (Vadalà et al., 2016)] that have been carried out to determine the geographical origin (across a number of regions) of garlic and from these studies summarised that trace element profiles could, to some extent, be applied to determine the origin.
Ahn and collaborators (Ahn et al., 2019) investigated the difference between domestic garlic from South Korea and imported garlic from China for pH, moisture content, total flavonoid content, and all trace minerals except for manganese and magnesium. They used logistic regression analysis to determine the geographical origin (South Korea or China) of garlic after selecting the appropriate independent variables. As a result, the calculated logistic regression equation from the analysis of copper, iron, phosphorus, zinc, and sucrose could be used to determine whether the geographical origin of garlic was South Korea or China. In contrast, Camargo and colleagues (Camargo et al., 2010) determined the mineral content of ten garlic cultivars from Argentina and attempted to establish a relationship between the mineral contents of the garlic cultivars and their geographic origins through the use of neutron activation analysis and PCA. Whilst they were able to demonstrate that cultivars cultivated under identical agri-environmental conditions could be categorised into groups by using just four metals as variables, they were unable to demonstrate that the use of mineral profiles constituted an adequate tool for determining the geographic origin of garlic.
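To illustrate how a fitted logistic regression equation of this kind is applied to a new sample, a minimal sketch is given below. The coefficients, intercept and example values are hypothetical placeholders (the published equation is not reproduced in this review), and the predictors are assumed to have been standardised beforehand.

```python
import math

# Hypothetical coefficients for illustration only; these are NOT the values
# published by Ahn et al. (2019). Predictors are assumed to be standardised.
COEFFICIENTS = {
    "copper": 0.8,
    "iron": -0.5,
    "phosphorus": 0.3,
    "zinc": -0.4,
    "sucrose": 0.6,
}
INTERCEPT = -0.2


def probability_of_class_one(sample):
    """Return P(class 1) by passing the linear score through the logistic function."""
    linear_score = INTERCEPT + sum(
        COEFFICIENTS[name] * sample[name] for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-linear_score))


def classify(sample, threshold=0.5):
    """Assign class 1 when the predicted probability reaches the threshold."""
    return 1 if probability_of_class_one(sample) >= threshold else 0


# Hypothetical standardised measurements for one garlic sample
example = {"copper": 1.2, "iron": -0.3, "phosphorus": 0.5, "zinc": 0.1, "sucrose": 0.9}
print(round(probability_of_class_one(example), 3), classify(example))
```

Which origin is coded as class 1, and where the decision threshold is set, are choices made when the model is fitted; 0.5 is used here only as a conventional default.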
Similarly, in another publication (Vadalà et al., 2016) the results of the analysis of garlic samples from Sicily, Tunisia and Spain by ICP-MS were compared. Twelve samples were analysed. Despite the small sample set the authors described the differences in the metal content of the samples linked to the agri-environmental conditions in which they were grown. The main finding of the study was the higher levels of Se in the Nubia Red Garlic samples from Sicily which the authors stated “is also useful to demonstrate that Nubia Red Garlic shows important health qualities and could be used as an anticarcinogenic agent”.
3.2.7. Honey
Four papers were selected that describe the determination of the geographical origin of honey based on the elemental composition. A further article of interest was identified in the preparation of this review. Honey is another high value commodity for which fraud has been reported. Mānuka honey is a premium priced product due to its purported antibacterial properties and health benefits attributed to it. Therefore, numerous research groups have been working to define methodologies to confirm the origin of a given honey whether it be Mānuka honey from New Zealand or another source.
Grainger and collaborators carried out a study to see if New Zealand honey could be differentiated from honey samples produced in the rest of the world based on the element profile. 352 honey samples from 34 different countries were sourced from across the globe (Grainger et al., 2023). All samples were derived from nectar and a combination of mono- and multi-floral botanical origins. The samples were analysed by ICP-MS for B, Na, Mg, Al, K, Ca, Cr, Mn, Fe, Co, Ni, Cu, Zn, Ga, Rb, Sr, Cd, Cs, Ba, Hg, Tl, 206Pb, 207Pb, 208Pb. Using a decision tree approach with five terminal nodes, the New Zealand samples could be distinguished from the rest of the world with 92% accuracy. Therefore, the results show that the approach may be a promising tool to determine whether a honey sample originated in New Zealand. However, additional samples will need to be analysed and added to the database to better capture the global variation. Data on the differences in element concentrations in honey samples prepared in different years are also needed to improve the robustness of the model.
Not all studies have been able to demonstrate the applicability of elemental analysis for origin determination when considering regions within a given country. Researchers in Uruguay (Berriel et al., 2019) analysed 25 samples of honey collected from apiaries throughout the country. Although the concentrations of the elements differed in the honey samples, the statistical analysis carried out could not differentiate between regions. This was attributed to the variability in the soils within each region.
A recent study (Pavlin et al., 2023) determined the concentrations of 18 elements in 173 honey samples covering 13 floral types and from five regions (Slovenia, Croatia, Bulgaria, Turkey, and Morocco). The main aim of the work was to assess the differences in the botanical sources; however, the levels of Na, Mg, and Fe were found to be more influenced by environmental factors and so the authors summarised that these elements could be considered as markers of geographical origin.
Spark discharge-assisted laser-induced breakdown spectroscopy was used (Fechner et al., 2021) to analyse 49 composite Argentinian honey samples. The samples were harvested across two years (2015 and 2016). Ca, K, Cu, Fe, and Mn were reported to be important indicators for the geographical origin of honey.
Bracatinga honeydew honey is exclusively produced in three regions of Brazil. Thirty nine elements were analysed in 34 Bracatinga honeydew honey samples (Silva et al., 2021) obtained from across the three regions (Santa Catarina, Paraná, and Rio Grande do Sul). All samples were harvested in 2018. Statistical analysis (cluster analysis, principal components analysis, and linear discriminant analysis) allowed for the differentiation of the honeys from the different regions with 91% accuracy with Rb and Co highlighted as the primary elements of interest.
3.2.8. Meat
There are many reports where trace element data has been used to support the development of models for the determination of geographical origin of beef, pork and lamb. For example, the group of Fernandes analysed beef samples from various regions in Brazil for their elemental composition with the aim being to develop an internationally accepted traceability system for Brazilian beef products. Neutron activation analysis was used. Differences could be observed in the data sets obtained from the different regions using the three machine learning algorithms applied (Multilayer Perceptron, Random Forest and Classification and Regression Tree) and multivariate statistics (E. A. D. Fernandes et al., 2020).
The Chinese Ministry of Agriculture identified Yanchi Tan sheep as an “Agro-product geographical indications” product in 2008 and it was designated a “Chinese Protected Designation of Origin” by the General Administration of Quality Supervision. Due to an increase in fraudulent activity, a method was needed to identify Tan sheep from the Yanchi region. Liu and co-workers (H. Y. Liu et al., 2021) collected lamb samples from across China and subjected them to multi-element and stable isotope analysis. Feed and soil samples were similarly analysed. The samples collected were from 2017 and 2018. A correlation was seen between the trace element concentrations in the lamb samples and the soil. The authors reported differences in the stable isotope ratios as well as the elemental profiles among the different regions the samples were collected from. Concentrations of Sr and Mo along with the stable isotope data could differentiate the Yanchi Tan lamb from other regions; however, further work is needed to consider lamb from other geographical locations to obtain a robust model.
Another study analysed pork belly fat samples from USA, Spain, Canada, Germany, Mexico, Chile and South Korea. The concentrations of 19 elements (Al, Ca, Fe, P, K, Na, S, Zn, As, Cd, Cs, Cr, Li, Mn, Ni, Rb, Se, Sr and V) were determined, and the following isotopes of the trace elements were measured: 6Li, 7Li, 116Cd, 50Cr, 51V, 52Cr, 53Cr, 55Mn, 58Ni, 75As, 85Rb, 82Se, 84Sr, 85Sr, 87Sr, 88Sr and 133Cs. Linear discriminant analysis and principal component analysis were applied to the data sets for the 480 pork belly fat samples tested (350 domestic and 130 imported). Domestic samples could be differentiated from imported samples with a discrimination rate of 98%. The authors concluded that trace element concentrations and isotope ratios are “promising descriptors which may be used to reasonably discriminate the geographical origins of the pork samples” (Nho et al., 2019).
The use of trace elements concentrations to determine the origin of pork from different regions of China was reported by Qi and collaborators. All seven regions studied (10 samples were collected from each region) had a distinct element profile. Data from 11 elements (Ca, Mn, Fe, Cu, Zn, Se, Rb, Sr, K, Na, and Mg) was used in the statistical analysis. The authors acknowledge that the data set was small and requires further input data to refine the models to generate a comprehensive method for origin determination (Qi et al., 2021).
The group of Varrà reported the development of a “promising method to confirm the declared pig meat label attributes, deter potential complex fraud, and support meat traceability systems”. 80 pig muscle and 80 pig liver samples were analysed by ICP-MS for 57 elements. Three sets of the samples consisted of meat from the PDO Parma Ham production process and a fourth group that were not compliant with the Parma Ham production process. Samples could be separated using the multivariate statistics applied but additional work is proposed to refine the models into useful traceability tools (Varrà et al., 2023).
Further examples of the use of element profiling to successfully verify the geographical origin have been published. For example, Rees et al. (2016) assessed poultry samples, mainly chicken, from Argentina, Brazil, Chile, China, Thailand and Europe (Austria, Czech Republic, Slovakia, Denmark, France, Germany, Hungary, Ireland, Italy, Netherlands, Poland and the United Kingdom). The study used stable isotope and elemental analysis, together with statistical processing of the resultant data, to determine the geographical origin of the poultry. Interestingly, the carbon stable isotope ratios of chicken meat indicated the quantity of maize in the diet, leading to a useful discrimination between a large proportion of European poultry and poultry reared in locations such as South America, Thailand and China where maize feeding predominates. It was stated that the use of poultry carbon isotope values as a simple ‘screening’ parameter to differentiate European poultry meat from other major importers was not as reliable as for the differentiation of European and South American beef. However, carbon isotope ratios would be useful in most instances to corroborate suspicion of mislabelling of non-corn-fed European poultry. A study by Heaton and co-workers (Heaton et al., 2008) analysed beef samples originating from the major cattle producing regions of the world (Europe, USA, South America, Australia and New Zealand) by IRMS and ICP-MS and was able to discriminate between beef samples on the basis of the broad geographical areas (Europe, South America and Australasia). The authors included the caveat that, although the methodology in its current state could be used to provide reliable origin information, this was dependent upon the countries under investigation.
Kim and collaborators (Kim et al., 2017) presented a study aimed at determining the concentrations of twenty-nine elements in 323 pork belly samples including 227 domestic (from Suncheon, Naju, Chungju, Gangjin and Yongin cities of South Korea), and 96 samples imported from USA, Germany, Austria, Netherlands and Belgium. The macro elements including Al, B, Ca, Fe, K, Mg, Na, P, S, and Zn were analysed by ICP-OES, whereas trace elements including Ba, Be, Bi, Cd, Co, Cr, Cu, Cs, Ga, Li, Mn, Ni, Pb, Rb, Se, Sr, U and V were analysed by ICP-MS, with multivariate data analyses of PCA and LDA. It was found that analysis of the trace elements was a promising approach for determining the geographical origins of the pork, where a discrimination index of 97% differentiated pork originating from different countries.
Finally, in a small single country study, researchers were able to successfully discriminate between mutton samples from agricultural and pastoral regions in China with 100% accuracy using ICP-MS analysis of Be, Na, Al, Ca, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, As, Se, Ag, Sb, Ba, Tl, Pb, Th and U (S. Sun et al., 2011).
3.2.9. Olive oil
Another high value commodity subjected to fraudulent claims is olive oil. Two papers were identified that described the use of multi-element profiling to determine geographic origin.
In the first (Beltrán et al., 2015), trace element profiles from Spanish olive oil and corresponding pomace from several different cultivars along with the corresponding soil samples were analysed for 34 elements. The olive oils were characterised by the levels of W, Fe, Mg, Mn, Ca, Ba, Li and Bi. The impact of fertiliser and fungicide used on the levels of the trace elements in the resulting olive oil needs to be investigated prior to standardising the methodology for the origin determination of Spanish olive oil.
In the second study (Telloli et al., 2023), triple quadrupole ICP-MS analysis was used to characterise extra virgin olive oil samples from Italy (24 samples collected over a 6-year period). Concentrations of Be, B, Na, Al, P, K, V, Cr, Mn, Fe, Co, Cu, Se, Ag, Cs, Tl, Pb, Th, U, Mg, Ca, Ni, Zn, Ga, As, Rb, Sr, Cd and Ba were determined. PCA was applied to the datasets to provide a classification according to the region of origin, which the authors claim to be a starting point for continuing work with larger sample sizes to enhance the model.
3.2.10. Rice
Basmati rice is grown in a specific area of the Indo-Gangetic Plains. The properties and flavour of the rice cannot be replicated in other areas. As Basmati rice attracts a premium price there have been a number of issues with authentication over the years and methods have been developed to support the confirmation of the product. Arif et al. (2021) “assessed the application of elemental analysis for the authentication of the geographical origin of Basmati rice”. 64 samples (21 of known authenticity) were analysed by ICP-MS. Statistical analysis was performed using the concentrations measured for 35 of the elements tested (23Na, 25Mg, 26Mg, 27Al, 29Si, 31P, 33S, 39K, 43Ca, 44Ca, 45Sc, 47Ti, 52Cr, 54Fe, 55Mn, 56Fe, 59Co, 60Ni, 65Cu, 66Zn, 69Ga, 72Ge, 75As, 76Se, 79Br, 81Br, 85Rb, 88Sr, 89Y, 95Mo, 137Ba, 139La, 140Ce, 197Au and 206Pb). Chemometric analysis (data driven soft independent modelling of class analogy - DD-SIMCA) identified eight elements (44Ca, 66Zn, 72Ge, 75As, 76Se, 88Sr, 95Mo, 206Pb) as the key discriminators to confirm that the origin of the rice was the Basmati region. The authors reported that the “sensitivity and specificity of the one class DD-SIMCA model were 100% and 98%, respectively”. They also highlighted the value of coupling the elemental analysis with other techniques, for example isotope ratio mass spectrometry, to further refine and improve the model (Arif et al., 2021).
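For readers less familiar with these performance measures, the sketch below shows how sensitivity and specificity are conventionally calculated for a one-class model: sensitivity is the proportion of target-class samples that the model accepts, and specificity is the proportion of non-target samples that it rejects. The counts used are hypothetical and are not taken from Arif et al. (2021); they were chosen only to give figures of the same magnitude as those reported.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts (NOT the study's data):
# 20 target-class samples all accepted, 49 of 50 non-target samples rejected.
sens, spec = sensitivity_specificity(true_pos=20, false_neg=0, true_neg=49, false_pos=1)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```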
High resolution ICP-MS was used (Cheajesadagul et al., 2013) to establish a model to confirm the geographical origin of Thai rice. 31 Thai jasmine rice and 5 other samples were tested with the concentrations of 21 elements measured. The multivariate analysis differentiated the samples and could also differentiate the Thai jasmine rice samples by region (northern, north-eastern or central regions of Thailand).
A recent study identified 14 elements as markers for the geographical origin of rice (Quinn et al., 2022). Samples (n = 151) harvested in 2018 and 2019 in Vietnam, China and India were analysed by ICP-MS and the concentrations of the elements measured were subjected to multivariate statistical assessment. The data showed the Chinese samples had higher concentrations of Ca, Al and Mn. Vietnamese samples had high Zn concentrations and low Ge concentrations and the Indian samples contained higher levels of B, Sr, Se, Cu, Mo, Co, W, Fe and Ti. The differences were “attributed to varying elemental compositions intrinsic to the soils from which they were grown”.
Bui and collaborators looked specifically at rice from Vietnam to establish if the different within-country regions where the rice was grown could be differentiated. The authors described a linear discriminant analysis and partial least squares-discriminant analysis model that separated Sengcu rice from other regions. As, Ba, Sr, Pb, Se and Ca differentiated Sengcu rice from other regions with 100% accuracy (Bui et al., 2022). A study investigating the geographical origin determination of hot pepper also reported that the 87Sr/86Sr ratios in rice could be linked to those of the water and exchangeable fraction of soil (Song et al., 2014).
Work by Maione and colleagues differentiated rice grown in the midwest and southern regions of Brazil using statistical models developed from the elemental composition of the 31 rice samples analysed. Cd, Rb, Mg and K concentrations were the primary discriminators (Maione et al., 2016).
70 rice samples and 35 topsoil samples (from the same region of Brazil) were analysed by ICP-MS. Arsenic concentrations were determined in the rice samples by HPLC-ICP-MS to provide total and species concentrations. Others have reported that the variability in the data was linked to “the geographical area, to crop management, producers and in a lower extent to soil composition” (Lange et al., 2019). There was no link between the data for the rice and for soil (applying PCA analysis). The authors suggested the inclusion of stable isotope data would provide a “more robust dataset for traceability”.
In a 2016 study looking at the stable isotopes of Cs and Sr as markers for radioactivity Srinuttrakul and Yoshida reported that Cs and Sr concentrations in Thai rice were higher than those for Japanese rice (Srinuttrakul & Yoshida, 2016).
3.2.11. Saffron
As the most expensive spice on the market, saffron has been the subject of a number of incidents of fraud in recent times. One example of such fraud is the “rebranding of cheaper growing regions to be passed off as regions of higher quality” (Wakefield et al., 2019). These authors determined the concentrations of 42 elements (Li, B, Na, Mg, Al, K, Ca, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, As, Se, Rb, Sr, Y, Mo, Ag, Cd, Sn, Sb, Cs, Ba, La, Ce, Pr, Nd, Sm, Eu, Gd, Dy, Ho, Er, Tm, Yb, Lu, Pb, U) in 41 Iranian saffron samples and 9 Spanish saffron samples by ICP-MS. In addition, stable isotope analysis was performed. Application of LDA allowed the saffron from Spain to be differentiated from that grown in Iran. The authors also found differences in the profiles year on year and therefore additional work is required to build on these databases and so better refine the models developed.
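As an illustration of how linear discriminant analysis separates two origins, the sketch below implements Fisher's two-class linear discriminant for two variables using only the Python standard library. The two classes ("A" and "B") and all measurement values are hypothetical; published studies such as the one above use many more variables and dedicated chemometric software.

```python
def mean_vector(rows):
    """Mean of each of the two variables across the samples in one class."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter_matrix(rows, centre):
    """2x2 sum of outer products of deviations from the class mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - centre[0], r[1] - centre[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_lda(class_a, class_b):
    """Return (weights, threshold) of Fisher's two-class linear discriminant."""
    mean_a, mean_b = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter_matrix(class_a, mean_a), scatter_matrix(class_b, mean_b)
    # Pooled within-class scatter matrix and its inverse (2x2 closed form)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det], [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]]
    weights = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
               inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Decision threshold: projection of the midpoint between the class means
    midpoint = [(mean_a[0] + mean_b[0]) / 2, (mean_a[1] + mean_b[1]) / 2]
    threshold = weights[0] * midpoint[0] + weights[1] * midpoint[1]
    return weights, threshold


def predict(sample, weights, threshold):
    """Project the sample onto the discriminant axis and compare with the threshold."""
    score = weights[0] * sample[0] + weights[1] * sample[1]
    return "A" if score > threshold else "B"


# Hypothetical two-element measurements for samples of known origin
class_a = [(2.0, 1.0), (2.5, 1.4), (3.0, 0.9), (2.2, 1.2)]
class_b = [(1.0, 2.0), (0.8, 2.5), (1.3, 2.2), (1.1, 1.8)]
weights, threshold = fit_lda(class_a, class_b)
print(predict((2.6, 1.0), weights, threshold), predict((0.9, 2.3), weights, threshold))
```

The threshold at the midpoint of the projected class means assumes equal prior probabilities and similar spread in both classes; with unbalanced reference sets a different cut-off may be more appropriate.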
3.2.12. Tea
Elemental profiling of Indian tea has been demonstrated (Vinay Jain, 2022) to identify the growing region of the teas within India. Tea samples (n = 150) were obtained from eight regions and the concentrations of B, V, Cr, Co, Ni, Zn, Se, Rb, Sr, Mo, Cs, Ba, Mg, Mn, Al, La, Ce, and Nd were measured by ICP-MS. The application of principal component analysis showed significant variation using data derived from 18 of the elements tested. According to the PCA loading values, the separation between the geographical origin of tea was driven by Sr, Ba, and B for principal component 1; Cs, La and Rb for principal component 2; and Mo, Ce, and Nd for principal component 3. When testing the model with unknown teas, the origin of all 24 were correctly classified.
3.2.13. Tomatoes
As the major tomato producer in Europe, Italy has a commercial interest in being able to demonstrate the authenticity of the fruit grown on its land. A study from 2018 measured the concentrations of 26 elements (Li, Be, Na, Mg, Al, K, Ca, V, Cr, Mn, Co, Cu, Zn, Ga, As, Rb, Sr, Ag, Cd, In, Cs, Ba, Tl, Pb, Bi and U) in 183 tomato-based products originating from Italy, China, US and Spain. Sampling took place in 2013, 2015 and 2017. A total of 169 samples were analysed using Inductively Coupled Plasma orthogonal acceleration Time-of-Flight Mass Spectrometry (ICP-TOF-MS). Multivariate statistical analysis was applied to the data to generate a model to discriminate Italian tomatoes. Three element ratios were identified as being able to identify the Italian tomatoes (Li/Cu, Co/Rb, and Sr/Cd). The dataset was tested against 14 samples described by the authors as “Italian sounding” tomato passata of unknown origin obtained from American companies. Two of the 14 samples were well separated from Italian tomatoes in the statistical model (Fragni et al., 2018).
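The three element ratios named above can be derived from a sample's measured concentrations as in the short sketch below. The concentration values are hypothetical and only the calculation of the ratios is shown; the decision limits used in the published model are not reproduced here.

```python
# Element pairs reported as discriminating ratios (numerator, denominator)
RATIO_PAIRS = [("Li", "Cu"), ("Co", "Rb"), ("Sr", "Cd")]


def element_ratios(concentrations):
    """Return each numerator/denominator ratio; all inputs must share one unit."""
    ratios = {}
    for numerator, denominator in RATIO_PAIRS:
        if concentrations[denominator] == 0:
            raise ValueError(f"{denominator} concentration is zero; ratio undefined")
        ratios[f"{numerator}/{denominator}"] = (
            concentrations[numerator] / concentrations[denominator]
        )
    return ratios


# Hypothetical concentrations for one sample (same unit for every element)
sample = {"Li": 0.02, "Cu": 4.0, "Co": 0.05, "Rb": 10.0, "Sr": 3.0, "Cd": 0.03}
print(element_ratios(sample))
```

Working with ratios rather than absolute concentrations can reduce the influence of dilution or concentration during processing, which may be one reason such ratios are attractive for processed products like passata.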
3.2.14. Whisky
Most studies to determine the geographical origin of whisky have involved GC, LC and spectrophotometric techniques. The use of multi-element data to determine geographical origin of whisky has not been extensively reported. Some authors (Adam et al., 2002; Pawlaczyk et al., 2019) using ICP-MS reported that their studies could not differentiate between samples of different geographical origins using multi-element data, although Irish whiskey could be distinguished from whisky samples from other countries as it was characterised by relatively high amounts of Ba and Ti. The authors suggested that their analyses were confounded by a small number of samples from each region and the type of whisky (single malt or grain vs blended malt or grain). The authors suggested that future studies should include a greater number of samples originating from different regions of Scotland and countries and from the same distillery.
Gajek and collaborators investigated the potential to use element concentrations to differentiate whisky samples, including according to their origin. 170 whisky samples from 11 countries (Scotland, the USA, Ireland, Poland, Japan, the UK, India, Azerbaijan, Slovakia, Wales, and Bulgaria) were analysed by ICP-MS for Ag, Al, B, Ba, Be, Bi, Cd, Co, Cr, Cu, Li, Mn, Mo, Ni, Pb, Sb, Sn, Sr, Te, Tl, U, and V, ICP-OES for Ca, Fe, K, Mg, P, S, Ti, and Zn and Cold Vapor-Atomic Absorption for Hg. The authors summarised the characteristics of the whiskies produced in USA, Ireland, Poland and Scotland. The authors reported that the whisky from the USA was characterised by the highest median values of Li, Be, V, Mn, Ag, Sb, Zn, P, Fe, and Ti when compared with samples from other countries. Scottish samples contained the highest median levels of Cu and Cd. They concluded that “the observed differences only prove that samples from various countries have completely different elemental fingerprints” (Gajek et al., 2022).
Nelson et al. (2017)[22] analysed 69 samples (16 Bourbons, 8 Irish, 9 Japanese, 1 Rye, and 2 Tennessee whiskey products and 33 Scotch whisky products) by Microwave Plasma-Atomic Emission Spectroscopy (MP-AES). Concentrations of Al, Ba, Ca, Cu, Fe, K, Mg, Mn, Na, Rb, Si, Sr and Zn were determined to be statistically different. Differences in Cu concentrations were proposed to be linked to processing conditions rather than the origin as copper is used in the distillery process.
3.2.15. Wine
As wine is a commodity for which there have been numerous reports of counterfeit products, it is not surprising that the literature review identified more publications on wine than on any other commodity.
Cellier et al. reported the use of 87Sr/86Sr and 208Pb/206Pb ratios to differentiate between champagne and other sparkling wines from across the globe. Samples included Spanish Cava, Italian Prosecco, sparkling wines from USA, Brazil, Chile, China, Argentina, South Africa, England and Australia. Analysis was by Multicollector-ICP-MS which allows for other isotopes to be measured. The authors reported differences in the Sr and Pb isotopes in champagne compared to other sparkling wines and so proposed that this combination could be used to discriminate the geographical origins and so support authenticity testing in the future (Cellier et al., 2021).
A recent article reported the use of ICP-OES and ICP-MS to determine trace element concentrations in soil, grapes grown on the soil and the wine produced from the grapes. Chemometric analysis was carried out on the data obtained. It was found that the use of bentonite clays in the processing of wine (as a fining agent) transfers rare earth elements into the wine, influencing the trace element profile, and so in this study the geographical origin could not be differentiated based on the elemental profile (Temerdashev et al., 2023).
Similarly, a previous study reported the influence of processing parameters as well as agricultural practices and pollution, on the elemental profile of, in this case, Turkish wines. ICP-AES and ICP-MS analysis of 13 wines was carried out and the limited sample set meant that it was not possible to fully assess the data statistically and so no conclusions as to the impact of geography on the element profiles could be ascertained (Sen & Tokatli, 2014).
Conversely, through the analysis of white wine throughout the production chain it was observed that there was no impact of the use of bentonite on the Sr ratios which remain consistent with those of the rocks and soils, suggesting that no further contribution is given by the addition of bentonite and yeast to the white wine Sr-isotope values. The Sr ratios in both the grapes and final wines preserved the isotope signature derived from the labile fraction of the soil where the vines were farmed (Tescione et al., 2020).
Pepi and Vaccaro studied the elemental concentrations of Prosecco wines and compared the data with that of the soil on which the vines were grown and grapes harvested. Geochemical and statistical analyses were able to discriminate the vineyard soils according to the geo-lithological characteristics of each area and to identify geochemical “Prosecco” fingerprints. The authors claim that these fingerprints could be used against fraudulent use of DOC wine labels (Pepi & Vaccaro, 2018).
In a study of 639 Italian wine samples analysed for their elemental composition, Chianti and Chianti Classico wines from Tuscany in Italy were compared with another 18 geographical regions but it was not possible to completely discriminate between the samples tested due to the close proximity of the regions from which samples were taken (Bronzi et al., 2020).
Martins and co-workers studied the Sr isotope ratios in Portuguese vineyard soils. Granite based soils showed higher 87Sr/86Sr ratios than the other soils (sedimentary formations). Although wines were not analysed in this study the authors proposed that an international databank of 87Sr/86Sr values should be set up to support geographical origin determination (Martins et al., 2014).
Authentic Bordeaux wines were shown to have a narrow range of 87Sr/86Sr ratios and Sr concentrations and so these parameters were reported to be suitable indicators of the geographical origin of these prestigious wines. The data was compared to that derived for 17 red wines, purchased in China, which included some which were labelled as from Bordeaux, but with labels that led to questions around the authenticity of the products. The results demonstrated a relatively narrow span of variabilities for the 87Sr/86Sr ratio and Sr concentrations in authentic Bordeaux wines. In contrast, the 87Sr/86Sr ratios of the wines purchased in China varied widely and these wines contained approximately 4 times the concentration of Sr. The authors postulate that the unique Sr binary signature may detect imitated wines and trace genuine products from different regional wineries. Together with the soil, the authors highlighted that climatic conditions, agricultural practices, management of wineries and wine-making techniques may all also impact the Sr ratio and total concentrations in the wine. Again, the need to collate data to support further model development was proposed (Epova et al., 2019).
Other research groups have published studies combining stable isotope and element concentration data to support the determination of geographical origin, highlighting the importance of combining datasets to achieve more suitable models to discriminate between locations. Rapa et al. (2023) described the analysis of seven Venetian wines for 63 elements and six isotope ratios, with As, Ca, Cs, δ11B and 87Sr/86Sr being the most informative. Wu et al. (2021) looked at French red wines in the same way but identified Mg, Mn, Na, Sr, Ti and Rb as the most suited to French wine regional traceability.
Saar de Almeida et al. (2023) reviewed the literature on the use of Sr isotope ratios to characterise wines according to their geographies. The method is based on the principle that the Sr isotope ratio in wine reflects that of the labile fraction of the vineyard soil from which the wine is produced and so can be used to determine geographical origin. The authors conclude that “Although limitations are evident when implemented at large (global) scales, we demonstrate that the 87Sr/86Sr isotope tracing technique remains a powerful and reliable tool for determining the geographical origin of wine when combined with detailed knowledge of the geological and soil characteristics of the substrata” (Saar de Almeida et al., 2023).
A vendor’s application note from 2015[23] reported differences in the trace element profiles of wines from different regions. 75 red wines produced in different regions of Italy and from various grape types were analysed by ICP-MS, measuring concentrations of 39 elements. PCA was performed on the data. Wines from Puglia contained higher levels of Cu, Sb and Pb but it was not confirmed whether this was linked to soil type, grape type, or cultivation and production methods. The authors reported that the high levels of Cu may be due to the use of copper compounds as mildewcides and fungicides or could be a consequence of the use of brass equipment during production and bottling. Wines from Tuscany had higher levels of Sr and Li. Trentino wines had higher levels of several rare earth elements. Overall differences were observed but the authors concluded that “further research will be needed to investigate whether these differences relate to soil and rainfall or are correlated to viniculture production differences”.
An Agilent Technologies Application note authored by Nelson et al. (2015)[24] studied the trace element concentrations of Malbec wines from Argentina and the USA. Analysis was carried out by MP-AES. Sr, Rb, Mg, Ca, Na, and K concentrations in 41 wines from the two regions (26 from the Mendoza region of Argentina and 15 from California in the US) were determined. In-house chemometric software (Mass Profiler Professional) was applied to the data generated followed by Partial Least Squares – Discriminant Analysis (PLS-DA). The statistical analysis differentiated Malbec wines from Argentina and the US, with 14 out of 15 US samples and 25 out of 26 of the Argentinian wines correctly classified.
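The classification counts quoted above can be converted into per-region and overall classification rates, as in the short calculation below. Only the counts come from the application note; the percentages are derived here and are not figures stated by its authors.

```python
# Correct classifications quoted in the text: region -> (correct, total)
results = {"USA": (14, 15), "Argentina": (25, 26)}


def accuracy(correct, total):
    """Fraction of samples assigned to the correct region."""
    return correct / total


per_region = {region: accuracy(c, t) for region, (c, t) in results.items()}
overall = accuracy(
    sum(c for c, _ in results.values()),
    sum(t for _, t in results.values()),
)

for region, value in per_region.items():
    print(f"{region}: {value:.1%}")   # USA: 93.3%, Argentina: 96.2%
print(f"Overall: {overall:.1%}")      # Overall: 95.1% (39 of 41)
```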
3.2.16. Cheese
As mentioned in Section 3.1.7, a working example of the application of analytical techniques to authenticate PDO cheeses has been reported (Camin et al., 2015). In an international collaborative study, and in combination with SIRA measurements, the content of Li, Na, Mn, Fe, Cu, Se, Rb, Sr, Mo, Ba, Re, Bi and U in cheese was measured in 13 different laboratories, after acid microwave digestion, using Inductively Coupled Plasma Mass Spectrometry or Optical Emission Spectrometry (ICP-MS or -OES). For elemental data, the average RSDr and RSDR values ranged between 2 and 11% and between 9 and 28%, respectively, consistent with methods reported by the FDA and in the literature for cheese.
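RSDr and RSDR denote the relative standard deviations of repeatability (within laboratory) and reproducibility (between laboratories). A minimal sketch of how they are commonly estimated from collaborative trial data is given below, assuming a balanced design with the same number of replicates per laboratory and a one-way analysis of variance in the style of ISO 5725-2. The function name and the data are hypothetical, and the source does not state exactly how the published values were calculated.

```python
from math import sqrt
from statistics import mean, variance


def precision_rsd(results_by_lab):
    """Return (RSDr, RSDR) in percent for a balanced collaborative trial."""
    replicate_counts = {len(values) for values in results_by_lab.values()}
    if len(replicate_counts) != 1:
        raise ValueError("a balanced design (equal replicates per lab) is assumed")
    n = replicate_counts.pop()
    lab_means = [mean(values) for values in results_by_lab.values()]
    grand_mean = mean(lab_means)
    # Repeatability variance: pooled within-laboratory variance
    s_r2 = mean([variance(values) for values in results_by_lab.values()])
    # Between-laboratory variance component, floored at zero
    s_L2 = max(0.0, variance(lab_means) - s_r2 / n)
    # Reproducibility variance combines both components
    s_R2 = s_r2 + s_L2
    return 100 * sqrt(s_r2) / grand_mean, 100 * sqrt(s_R2) / grand_mean


# Hypothetical duplicate results for one element from three laboratories
data = {"lab1": [9.0, 11.0], "lab2": [19.0, 21.0], "lab3": [14.0, 16.0]}
rsd_r, rsd_R = precision_rsd(data)
print(f"RSDr = {rsd_r:.1f}%, RSDR = {rsd_R:.1f}%")
```

Because the reproducibility variance includes the repeatability variance, RSDR is never smaller than RSDr, which is consistent with the ranges quoted above.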
3.2.17. Limitations and gaps in trace element analysis
Examples of the use of trace element data to determine the geographical origin of foodstuffs have been presented, with over 50 papers and review articles being summarised above. As expected, the size of the sample set and confidence in the origin of the samples both influence the quality of the prediction models produced. In many of the examples considered it was a combination of techniques that allowed the geographical origin to be determined. Continuing to derive measurement data for authentic samples of high value foodstuffs and refining the models developed is therefore essential to protect against future fraudulent claims as to the origin of these products. Many of the papers compared one region with another rather than providing an indiscriminate tool to determine the origin of a sample when considering the potential worldwide source. When using trace element data, one needs to consider the relationship of geochemicals with the soil, and particularly the transfer factors and the pH of the soil, which can influence the stability of a signal, as can the application of fertilisers. Finally, almost all the authors highlighted the need to continue to add trace element data to the databases developed, to allow the impact of season by season and year on year variability to be fully addressed.
3.2.18. Conclusions for trace element analysis
In conclusion, the publications considered within this literature review demonstrate that trace element data can be used to support the determination of geographical origin. However, due to the limited data available the examples provided generally compare two distinct locations and would all benefit from more extensive data sets. At the present time trace element analysis alone cannot be considered to be an indiscriminate tool to determine the origin of the sample when considering the potential worldwide source.
3.2.19. Outlook for trace element analysis
Continued data collection to support the development of more extensive and robust databases covering more regions, addressing uncertainties of season by season and year on year variability is required to develop this tool further. In addition, the strength of using multiple complementary techniques to verify origin has been highlighted and so coupling trace element data with proteomic, genomic, stable isotope or other data is expected to improve the models developed to date.
3.3. Metabolomics and profiling techniques
Metabolomics is the large-scale study of small molecules, commonly known as metabolites, within biological materials. The papers cited in this section were selected through the filtering process to cover analytical technologies enabling metabolite profiling studies for the determination of geographical origin of food. The use of these technologies and their application to food commodities is discussed along with the application of data analysis techniques such as multivariate statistics, artificial intelligence (AI) and organisational workflows such as metabolomics. This section focusses on technologies that measure the molecular composition of foodstuff and use the variability in composition to assess geographical origin. This usually relates to specific compounds (markers) or groups of compounds (fingerprints). This section does not include isotope, protein or DNA based approaches which are covered elsewhere and largely focusses on small molecules, although some techniques discussed will register signals from more complex molecules such as oligosaccharides and triacyl-glycerols.
A comprehensive review of the analytical methods used for the determination of the geographical origin of food (Luykx & van Ruth, 2008) categorised methods into four groups: mass spectrometry techniques, spectroscopic techniques, separation techniques, and other techniques. The authors conclude that a combination of methods analysing different types of food compounds seems to be the most promising approach to establish geographical origin, recognising the need for statistical approaches to bring together the data.
However, separation techniques are frequently combined with mass spectrometry and sometimes with spectroscopic techniques such as NMR spectroscopy. Gas chromatography coupled with mass spectrometry has been extensively used for volatile profiling, whilst liquid chromatography coupled with high resolution mass spectrometry (LC-HRMS) provides a rich source of information for assessing the non-volatile components of food. Other hyphenated technologies such as gas chromatography coupled with ion mobility spectrometry have shown recent promise (Zhu et al., 2023). The holy grail for origin determination is field portable and handheld devices that can be readily used by unskilled operators. Full integration of rapid online testing for food authenticity within the food supply chain has recently been discussed, capturing the most promising approaches for a fully integrated system, whilst highlighting challenges, particularly in developing countries (McVey et al., 2021).
3.3.1. Targeted vs non-targeted
Analysis methods for food origin determination can be described as targeted or non-targeted, the merits of which have been discussed (McGrath et al., 2018). Targeted methods are those which seek to measure a specific component of a food, for example, pesticide residues or nutritional components such as iron or potassium. Targeted methods are often preferred as they do not rely entirely on databases of reference information to draw a conclusion, particularly in well studied foodstuffs. A good example is the use of markers from the nectar of specific flowers to determine the origin of honey (see honey later). Local flora is often specific to a particular country or region and it is often possible to use nectar markers to determine the origin of honey, with the most high profile example being the New Zealand government’s use of 4 biomarkers to define authentic New Zealand Mānuka honey.
As origin determination is often complicated by factors that impact on the uniqueness of markers, non-targeted methods are also often employed. These methods measure the composition of a foodstuff through a “fingerprint” or profile. This can be a spectral profile which carries the signature of the product as a whole (for example UV or IR spectroscopy). These technologies rely on having an established and robust database or training set of samples that can be used to determine the success with which the origin of a test sample can be determined. As such, they rely heavily on the use of chemometrics and machine learning (X. L. Zhang et al., 2021) and are often referred to as black-box approaches, as the basis on which origin classification is achieved is often unclear. More sophisticated non-targeted approaches such as LC-HRMS and NMR spectroscopy also provide data about components of a foodstuff that can be used for origin determination. This is usually a set of several (often hundreds of) biomarkers that in combination are characteristic of a particular geographical source. The rationale for using these approaches may be based on local climate, geology or practices such as feedstuff and fertilisers.
Having access to metadata such as permissible additives, clarifying agents, yeasts and knowledge of permissible production processes is very valuable when performing targeted or non-targeted profiling of food. Taking the example of the EU wine databank, if information is available relating to the different approaches for populating the databank with authentic micro-vinified wine samples versus mass produced retail samples, signals can be detected to determine whether steel tanks or wood chips have been used during vinification, to support origin verification.
Whilst rarely conclusive on their own, non-targeted methods can provide highly indicative data to support a weight of evidence approach for origin determination, particularly when taken with the results of other analytical tests and data from traceability systems.
3.3.2. Multivariate analysis, data fusion and Omics
Data from both targeted and non-targeted approaches can be used in multivariate analysis and data fusion approaches. Multivariate analysis, as the name suggests, relates to the statistical analysis of data with more than one variable. Often many thousands of variables are used as input to a multivariate analysis, with data from many hundreds of samples, resulting in complex data matrices. Tools that are often employed to simplify these data include principal components analysis and machine learning.
Data fusion seeks to take data from the same samples and join them to improve the information content of a dataset. This is a logical approach used routinely in assessing the origin of foods, as most analysts will use multiple data sources when drawing conclusions. However, data fusion is usually used to combine data from different analytical techniques that are then further subjected to some kind of multivariate analysis. An example of this is the classification of salmon from 5 different regions using a dual-platform mass spectrometry data set. Eighteen robust lipid markers and nine elemental markers were found, which provided robust evidence of the provenance of salmon. This study showed that a combined data fusion and multivariate analysis strategy greatly improved the ability to correctly identify the geographical origin and production method of salmon (Hong et al., 2023).
The combination of non-targeted analysis with multivariate techniques often attracts an “omics” suffix, as is the case when using metabolomics workflows (Mialon et al., 2023). This has been suggested to be an effective way of marketing products with specific Geographical Indications (GI) (Cassago et al., 2021). The following sections describe published studies from the aforementioned literature review process. Each section describes how analytical methods have been combined with multivariate statistics and AI to address the reporting of the origin of food.
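As an illustration of how principal components analysis simplifies a samples-by-variables data matrix, the sketch below autoscales each variable and projects the samples onto the first two principal components using a singular value decomposition. It is a minimal sketch rather than a method taken from any cited study: the function name is an assumption, and the data are randomly generated (30 hypothetical samples by 6 variables) purely so that the example runs.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """Project samples onto the leading principal components of autoscaled data."""
    X = np.asarray(X, dtype=float)
    # Autoscale: centre each variable and scale it to unit variance.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the scaled matrix; rows of Vt are the loadings.
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # sample coordinates
    explained = s ** 2 / np.sum(s ** 2)              # variance fraction per component
    return scores, explained[:n_components]


# Hypothetical data: 30 samples x 6 variables (random numbers, not real measurements).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 6))
scores, explained = pca_scores(X)
print(scores.shape, explained)
```

In practice the score plot (first component against second) is inspected for grouping by origin, and the scores are often passed on to a classifier such as LDA.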
3.3.3. Edible Oils
Analytical methods that have been used for geographical origin determination of edible oils have been robustly reviewed (Tahir et al., 2022). A systematic review of papers published between 1 January 2013 and 15 December 2020 identified sixty-six full text articles that met the selection criteria for inclusion in the review, which required the use of both analytical techniques and multivariate analysis. The authors concluded that geographical origin was a major source of the variation in oil composition. Targeted analysis of the following components was frequently carried out in oils: phenolic compounds, fatty acid profile, sterols, triacylglycerols (TAGs), volatile compounds and colour. The most popular approach (half of the papers) measured the elemental composition of the oils using, for example, inductively coupled plasma mass spectrometry (ICP-MS). A trend towards the use of NMR, IR spectroscopy and chromatography was noted. The predominant oil types reported to be analysed were olive oil (including virgin and extra virgin) mainly from Europe and North Africa. Several studies also considered palm oil from diverse locations in Asia and Africa (Pérez-Castaño et al., 2015), (Ruiz-Samblás et al., 2013), (Tres et al., 2013), (Jolayemi et al., 2018). Multivariate analysis techniques that were employed included principal components analysis (PCA) and partial least squares (PLS) regression, usually coupled with LDA or DA to predict geographical origin based on the input variables from analytical testing. A wide range of multivariate analysis techniques have been assessed to determine their applicability for the analysis of data from oils (Avramidou et al., 2018) with novel AI approaches proposed (artificial neural networks, fuzzy logic, expert systems, decision trees, support vector machines). Similar approaches have been reported for geographical origin determination using FTIR applied to Greek olive oil (Soh et al., 2023).
Using FT-NIR and headspace gas chromatography ion mobility spectrometry (HS-GC-IMS), methods were developed to reveal the authenticity of Slovenian olive oil. FT-NIR offers in-field testing capability, with high sample throughput and low operational costs; it requires little or no sample preparation, and does not need chemicals or specialized laboratory facilities. The research carried out by the Joint FAO/IAEA laboratory was designed to verify the origin of Slovenian extra virgin olive oil from the Istria region, which has a protected designation of origin and is a high value product (Kelly & Midgley, 2024). More specifically, the project aimed to support ‘made in Slovenia’ branding regulation, so a model was developed to classify oils as Slovenian or non-Slovenian using OPLS-DA and SIMCA. A total of 64 authentic extra virgin olive oils were used in the study, collected over two years from Slovenia, Italy, Croatia, Greece, Tunisia and Spain as part of an IAEA collaborative research initiative with these countries. Scientists using the method were able to tell the difference between extra virgin olive oil from Slovenia and other countries with between 86 per cent and 93 per cent accuracy, after screening and processing the data obtained. Scientists at the FAO/IAEA laboratories also used Fourier transform infrared spectroscopy with attenuated total reflectance (FTIR ATR) to accurately discriminate between olive oils from different regions of Lebanon, where olive oil quality and value vary by region.
It was noted by Tahir and co-workers (Tahir et al., 2022) that using multiple data sets from different analytical sources generally improved classification rates and therefore the robustness of origin determination. However, all studies presented required bespoke reference data, with sample numbers varying considerably.
The success of the discrimination between oils from different geographical locations was heavily dependent on the analytical technology employed and the sample set. Many technologies have been applied to determine the origin of oil including a wide range of spectroscopies and separation technologies. Some success has been achieved using technologies such as electronic noses and tongues although the nature of the technologies means that the data can be difficult to interpret due to low information content (Melucci et al., 2016), (Haddi et al., 2013), (Souayah et al., 2017). The review concludes that the most successful methods for determining the origin of oils are multi-analyte methods when combined with multivariate analysis.
The use of NMR for the analysis of olive oil has been reviewed and points to this approach being amongst the most promising for geographical origin determination of extra virgin olive oil (Maestrello et al., 2022). The authors highlight that the major advantage of the approach is the ability to detect different types of fraud and to perform quantitative measurement of oil components from the same analytical data. Liquid and gas chromatography coupled with TOF mass spectrometry has been evaluated for the discrimination of Mediterranean extra virgin olive oil (Olmo-García et al., 2019). The study combined different data sets to produce statistical models which highlighted key variables that were associated with geographical origin and production year. The study focused on characterising compounds giving rise to those variables, showing the power of high-resolution mass spectrometry to discover new (bio)markers that can be transferred to lower technology platforms.
Official methods for the analysis of olive oil have been critiqued (Conte et al., 2020) and suggestions for improvements made. There is a need to embrace more advanced technologies such as vibrational spectroscopy and NMR spectroscopy, MS, biosensors, and DNA-based approaches. These represent promising alternatives for the authentication and traceability of olive oil, because of their sensitivity, high-throughput, reproducibility and robustness in comparison with conventional methods currently used. However, the cost of the instrumentation may be prohibitive for routine analysis and there is also doubt over the robustness of sampling plans in published studies. Conte et al. (2020) conclude that tightening of (European) legislation and international harmonisation in relation to analytical approaches would help to prevent fraud in this sector.
3.3.4. Wine
Analytical methods applied to wine have recently been reviewed (X. Y. Sun et al., 2022). The authors broadly identified 4 types of approach to wine analysis for the determination of geographical origin as well as vintage and grape variety. Mass spectrometry, spectroscopic techniques, chromatography and other techniques were considered along with a range of multivariate statistical analysis methods. The authors stressed that a critical initial step is to ensure that there is good information for reference samples on geographical origin and on any confounding factors such as vintage. The authors conclude that the determination of wine origin using analytical methods is more mature than the determination of age or grape variety.
Analytical methods are discussed along with their application. Raman spectroscopy and fluorescence spectroscopy are reported to be sparsely used in wine research. Mass spectrometry was believed to be more precise and perform better than spectroscopy and chromatography. However, stable isotope technology (SIRA, ICP-MS and NMR) was expected to continue to be the standard approach for wine traceability due to its high precision and relative ease of application. IR technology was thought to be easy to operate with low cost and size of the instrument aiding portability and uptake in the field. A study found that using FTIR spectra and quality parameters determined by chromatography and mass spectrometry could be combined using a novel multivariate method for the determination of wine origin (Dong et al., 2023).
LC-MS, NMR and GC-MS are the three non-targeted analysis techniques commonly used for authenticity assessment of wine. Sun and co-workers (X. Y. Sun et al., 2022) suggested that the wine industry is moving towards a combination of multifactor analysis, instrument detection and chemometric analysis, and that the technology for wine authenticity and traceability will become more sensitive, reliable and convenient.
The roles of molecular spectroscopy (Chandra et al., 2017) and NMR spectroscopy (Le Mao et al., 2023) have also been discussed at length. The use of vibrational spectroscopy such as near and mid infra-red (NIR/MIR) was discussed alongside the use of Raman spectroscopy and sensor technologies. It was concluded that the use of even fairly simple techniques such as these was in its infancy in the wine sector for the determination of provenance, and was perhaps held back by a lack of technical knowledge about modern techniques within the sector. In relation to NMR spectroscopy, variations in sample preparation protocols were highlighted as a potential issue when comparing data from different studies. The reduction in transparency that follows the commercialisation of tools is also a barrier to future development.
Le Mao and collaborators (Le Mao et al., 2023) suggested that the affordability and portability of NMR instruments should be improved to aid uptake, with benchtop NMR being one possible solution. This is particularly attractive for wine as some spectroscopies such as near infra-red have been shown to provide valuable information from unopened bottles of wine (Harris et al., 2023), which is particularly pertinent to high value wines. Non-invasive technologies coupled with machine learning or artificial intelligence algorithms (Carneiro et al., 2023) for the authentication of vintage wine would provide a useful advance in fraud prevention.
3.3.5. Coffee
Several techniques have been applied to coffee to determine geographical origin and often these are combined studies looking at botanical source (Arabica or Robusta). Spectroscopic techniques have been most widely applied to coffee and their use has been recently reviewed (Munyendo et al., 2022). The authors discuss the application of vibrational spectroscopy to coffee for a wide range of applications including for the determination of geographical origin. The review suggests that the cited studies give highly variable results when classifying coffee based on its geographical origin, with some studies suggesting that NIR spectroscopy is a useful tool and others directly contradicting these findings. The validity of the studies will be highly dependent on the number and nature of the samples used as a reference collection for training the multivariate analysis models.
NMR spectroscopy has been used to differentiate 603 roasted arabica samples from Brazil, Ethiopia and Colombia (Gottstein et al., 2024). Unlike wine and oil, coffee has significant lipophilic and hydrophilic components so two extractions were used to capture the profile of the coffee. The study showed that it was only possible to differentiate African from South American coffee using PCA and LDA to classify the NMR data.
An LC-HRMS study using coffee from Brazil and Mexico (Artêncio et al., 2023) illustrated how the method could be combined with multivariate statistics to identify key components of the coffee that varied as a function of botanical and geographical origin. Whilst initial results looked promising, the study used only 21 samples with 19 of those from Brazil and 2 from Mexico.
A study of Colombian coffee (Arana et al., 2016) compared a successfully implemented NMR method for verifying the protected origin of Colombian coffee with cheaper alternatives. The study used GC-MS and GC-C-IRMS and compared the success of these techniques with that of NMR. The study found that distinguishing Colombian coffee from coffee from the neighbouring countries Brazil and Peru was less successful using the GC based approaches than when using NMR.
In a study investigating Arabica and Robusta coffee from Africa, Asia, Central and South America (Mannino et al., 2023), a combination of chemical (UV, HPLC-DAD–MS/MS, GC–MS, and GC-FID) and molecular fingerprinting (PCR-RFLP) was used to discriminate commercial green coffee accessions from different geographical origins. The authors concluded that using a combination of high-throughput metabolomics with phenolic compounds, fatty acids, xanthine derivatives, and melatonin, along with antioxidant power and DNA fingerprinting, they were able to discriminate the two coffee species and partition the individual accessions of the species according to their geographical origin. The study was somewhat limited in size as only 15 coffee samples were used for the analysis.
3.3.6. Rice
A review of the potential for rice fraud was published recently (Sliwinska-Bartel et al., 2021). The lack of a coordinated approach is evident in the diversity of the methodologies applied in a range of studies summarised by the authors. Isolated studies have been used to exemplify a potentially simple set of issues, suggesting that there should be a standardised and international approach to address questions around rice authenticity, potentially at an inter-governmental or international standard body level. The review summarises well issues which may occur, such as blending of rice from different origins. The review also discusses the diversity and proliferation of academic studies in this area but provides no solid evidence that there is ongoing collaboration with stakeholders and reads largely as conjecture of the likelihood of mislabelling of rice origin.
Distinction of rice origin is achieved mainly by the application of ICP-MS and SIRA to determine trace element and stable isotope profiles respectively. This is described in other sections of this report. Other techniques include the use of GC-MS (with or without headspace (HS) capture) to differentiate rice from Asia (China, Korea and Malaysia), often using PCA-LDA to discriminate between volatile profiles (Ch et al., 2021). LC-HRMS, Raman and NMR spectroscopy (Huo et al., 2017) have all been used in separate studies, but few coordinated conclusions can be discerned from the published data, beyond the robust summary of activities provided in the Sliwinska-Bartel review (Sliwinska-Bartel et al., 2021). The conclusions of this paper are necessarily somewhat vague, indicating that there is a need to coordinate international research to help to focus efforts. Similar conclusions can be drawn from the review by Wadood et al. (2022), which specifically highlights the effectiveness of IRMS and ICP-MS for the determination of rice origin.
Data analysis methods that have been applied for the authentication of rice have also been reviewed (Maione & Barbosa, 2019). The paper highlights that PCA-LDA is the most commonly applied multivariate analysis tool for the determination of rice origin, whilst highlighting the use of support vector machines (SVM) and artificial neural networks (ANN), bringing AI techniques into use. The paper also introduces the use of image analysis for the verification of rice, but this is largely for the determination of rice variety, which is not always linked to origin and can be robustly addressed using DNA based techniques.
3.3.7. Cocoa
Analytical approaches for the determination of the origin and authenticity of cocoa beans have recently been extensively and robustly reviewed (Fanning et al., 2023). The authors highlight that the many steps in the supply chain and the price difference between ‘fine’ and ‘bulk’ cocoa present a risk in terms of food fraud. Methods to identify geographical origin using quality attributes were assessed, including spectrometry, spectroscopy and sensory studies. The paper suggested that integrating instrumental and sensory attributes will help to identify relevant and comprehensive geographical quality indications. A common theme for cocoa origin determination is a need to transition to more rapid, affordable and non-destructive analytical approaches. The use of advanced data analysis methods (for example AI) will also modernise traditional traceability methods. The need for harmonization of methods and the curation of authentic samples was highlighted as this is needed to produce robust geographical indications to establish cocoa terroir effectively. The authors concluded that it was abundantly clear that geographical origin plays a critical role in the quality of cocoa, with most studies reporting a significant difference in composition in cocoa from different geographical origins, a conclusion that is supported by data held at Fera (unpublished data).
Sensory methods (organoleptic) also effectively differentiated cocoa origin. However, more rapid NIR spectroscopy was able to reproduce the discrimination of cocoa samples obtained with a trained sensory panel. It was recommended that future studies explore more rapid and non-destructive techniques. Exploration of advanced machine learning algorithms (for example AI) to improve the origin classification and prediction would improve such analysis results with the caveat that transparency is a key component. To achieve this, methods and procedures must be harmonized and include reporting of verification of authentic samples used to draw conclusions.
Cocoa beans and cocoa bean products (CACBP) have been subjected to various analytical technologies to determine their origin. Practical application has focussed on spectroscopic techniques such as NIR spectroscopy (Anyidoho et al., 2020), (Teye et al., 2020), largely due to the requirement for field portability into often remote regions of, for example, West Africa and a need to verify origin close to source. The authors conclude that “more work needs to be done to move this technology from the laboratory applications to real usage in developing countries for optimum global benefits” and “Developing portable and affordable instruments is urgently needed particularly in West Africa. It would make onsite measurement application in developing countries possible and aid global traceability and production of high-quality cocoa beans”. GC-MS successfully classified origin at the country level. Applying NMR spectroscopy as a non-targeted metabolomic approach was also successful. The need to supply metadata has been highlighted. Not all studies reported vital information on samples, such as the cocoa bean variety, which the authors say is necessary for the successful development of a database for cocoa traceability. Peripheral technologies such as flow infusion electrospray ionisation mass spectrometry have also been trialled on dark chocolate from a range of origins, with promising initial results (Acierno et al., 2018).
Others have also focussed on correlating the composition of cocoa bean products, such as chocolate, with the composition of the cocoa beans used in their manufacture. The polyphenol content of chocolate has been directly related to that in the source cocoa beans using HPLC-DAD-MS and PCA (Cambrai et al., 2017). This was shown, for the first time, to be a useful tool to differentiate chocolate containing cocoa beans from Madagascar, Caribbean, South American and African countries. Previous work had sought only to distinguish cocoa beans in chocolate sourced from different continents, indicating that the development of technology is improving the spatial resolution of the analysis.
3.3.8. Honey
Methods for the determination of honey origin have been comprehensively reviewed by (Soares et al., 2017, funded by the EU 7th Framework Programme Food Integrity: Ensuring the integrity of the European Food Chain) and more recently by (Danieli & Lazzari, 2022).
Both papers describe a range of methods that have been applied to determine the authenticity of honey and discuss botanical and geographical origin determination. Honey is somewhat unique in that its composition is determined by the variability in local flora and this can be used as a basis on which to determine origin. At the most rudimental level, pollen analysis (melissopalynological) using light microscopy can be used to determine the amount of pollen contributing to the overall pollen content by a range of different plant species. Whilst the approach is comparatively low-tech, the skill and experience required to differentiate accurately between pollen types should not be underestimated. This has resulted in a relatively small number of specialists who are able to determine geographical origin based on pollen. The approach also relies on pollen from different plants being distinguishable, which is usually the case. However, some pollen types are indistinguishable (for example Mānuka and Kanuka) and some pollen types will exist in many locations, so the pollen fingerprint of a honey may not uniquely identify its origin. In other cases when pollen native to a particular location can be identified, then pollen analysis is a powerful technique. To reduce the reliance on specialist pollen analysts, there have been recent attempts to use molecular biology techniques to analyse DNA from honey pollen. This can be hindered by a low copy number, but progress is being made with, for example, digital PCR which may allow the approach to become routine (You et al., 2021).
Analytical approaches for the origin determination of honey are covered in different sections of this document, with IRMS and ICP-MS being the most routinely applied, to measure isotopic ratios and trace element composition in honey, which are both well known to be associated with geographical origin. However, the most recent rapid progress for the determination of honey origin is being made by applying a range of techniques to determine the composition of the honey. Targeted and non-targeted methods have been applied to determine biomarkers that are associated with origin. As with pollen analysis, these approaches can usually determine botanical origin, which then by proxy can be used to draw conclusions about geographical origin. Work carried out under by the EU 7th Framework Programme Food Integrity (Soares et al., 2017) identified several classes of compounds that have been used to distinguish both the botanical and geographical origin of honey. Phenolic compounds have been extensively studied, along with volatile components, sugars, organic and amino acids. The technologies used to measure these groups are well established (for example GC-MS or GC-FID for volatiles) and the studies undertaken are described in the previously cited reviews, showing that each class of compound can be used to differentiate between honey, mainly by floral type. These analyses have resulted in a range of markers being identified that are specific to plants commonly associated with honey production. For example, the phenolic compound, hesperetin, has been identified as a marker of citrus honey.
An evaluation of the methods most used in the analysis of the geographical and botanical origin of honey, particularly in the last decade, has been published (Danieli & Lazzari, 2022). Current state-of-the-art technologies include metabolomic/genomic approaches and blockchain. The authors conclude that when methods are used in combination this usually leads to greater accuracy of origin determination, linking data through multivariate statistical or chemometric methods. As with other commodities discussed here, a range of techniques have been trialled and found to be able to classify honey by origin, with NMR perhaps being the most accepted (Biswas et al., 2023) but with rapid developments being made in the use of MS and other technologies such as Raman spectroscopy.
The development of data fusion and multivariate analysis methods specifically for application to honey is also noteworthy. The potential benefit of data fusion based on different complementary analytical techniques was investigated (Schwolow et al., 2019). Sixty-four honey samples from three different origins were analysed by attenuated total reflection IR spectroscopy (ATR/FTIR) and HS-GC-IMS. The datasets obtained were combined in a low-level data fusion approach with a subsequent multivariate classification by PCA-LDA or PLS-DA. Validation results of the classification models were compared to the results that could be obtained by using the individual data blocks separately. A decreased cross-validation error rate and a more robust model were obtained due to the low-level data fusion. The results show that data fusion is an effective strategy for improving the classification performance, particularly for challenging classification tasks such as determining the origin of honey.
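The idea of low-level data fusion can be illustrated with a minimal sketch. It is not the method of Schwolow et al.: the data are invented toy values, the block weighting (dividing by the square root of the number of variables) is one common convention assumed here, and a simple nearest-centroid rule stands in for the PCA-LDA or PLS-DA classifiers named above. The function names are hypothetical.

```python
import math

def autoscale(block):
    """Mean-centre each variable (column) of one data block and scale it to unit variance."""
    n_rows, n_cols = len(block), len(block[0])
    scaled = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        col = [row[j] for row in block]
        mean = sum(col) / n_rows
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n_rows - 1)) or 1.0
        for i in range(n_rows):
            scaled[i][j] = (block[i][j] - mean) / sd
    return scaled

def fuse_low_level(blocks):
    """Concatenate the scaled blocks sample by sample (low-level fusion).
    Each block is weighted by 1/sqrt(number of variables) so that a wide
    block does not dominate a narrow one (an assumed convention)."""
    fused = [[] for _ in blocks[0]]
    for block in blocks:
        weight = 1.0 / math.sqrt(len(block[0]))
        for i, row in enumerate(autoscale(block)):
            fused[i].extend(v * weight for v in row)
    return fused

def nearest_centroid(train_x, train_y, sample):
    """Stand-in classifier: return the label whose class centroid is closest."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(c) / len(rows) for c in zip(*rows)]
    return min(centroids, key=lambda lab: math.dist(sample, centroids[lab]))

# Invented toy data: six samples, two origins, two analytical blocks.
ir_block = [[1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [0.9, 1.9, 3.1],
            [2.0, 1.0, 4.0], [2.1, 0.9, 4.2], [1.9, 1.1, 3.9]]
gc_block = [[10.0, 5.0], [11.0, 5.2], [9.5, 4.9],
            [20.0, 1.0], [21.0, 1.2], [19.0, 0.9]]
labels = ["A", "A", "A", "B", "B", "B"]

fused = fuse_low_level([ir_block, gc_block])
predicted = [nearest_centroid(fused, labels, row) for row in fused]
```

In a real study the scaling parameters and the classifier would be fitted inside each cross-validation fold, so that the held-out samples do not influence the model.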
The state-of-the-art for honey analysis also incorporates AI technologies for data analysis. For example, machine learning has been used to identify volatile components in citrus honey that are linked to origin (Karabagias & Nayik, 2023).
3.3.9. Meat
Recent food origin issues in the UK have largely concerned meat: first the discovery of horsemeat originating from elsewhere in Europe in products labelled as beef and, more recently, the use in processed products of meat that was inaccurately declared to be from the UK. The scientific basis for these issues has been the subject of conjecture and, to a lesser extent, of scientific review. The confounding issues relating to meat origin mainly concern the movement of animals and meat products during rearing and supply, which, from an analytical perspective, makes it very difficult to identify the source of meat products. Analytical methods for the detection of meat origin also take little account of animal feed, which influences the biochemical profiles measured by all of the testing methodologies. For example, isotopic ratios and trace elements, when tested in a meat product, will vary as a function of the feed used to rear the animals. Feed can be obtained from many different geographical locations and is a major factor in meat composition (Monahan et al., 2018). Routine testing for origin and the establishment of databases for meat authenticity will fail if they do not account for feeding regimes and the movement of animals. Some success has been found in the use of technologies such as genomics, which rely on the genetic code of the animals. However, movement still compromises this as germplasm is often acquired from gene banks (the UK is a leading supplier of beef germplasm, for example) and the animals may be reared elsewhere.
Profiling of the gut microbiome has been proposed, particularly for ruminant animals, but this is highly speculative. Meat authenticity is perhaps the most challenging area covered here due to movement and feeding regimes. The lack of any confident review of meat origin determination indicates the need to consider all data (a weight of evidence approach) when assessing meat origin. Challenges pertaining to the interpretation of data, as they relate to assignment of dietary background or geographical origin, are discussed in the review by Monahan et al. (2018). They concluded that “among many factors such as the global nature of trade in meat, the complexities of the food chain associated with meat production, consumer demands for more information about the food they consume and the potential for fraud, there seems little doubt that meat authenticity will continue to be a subject of discussion and research into the future”. Furthermore, they say that “From the particular perspective of dietary background and geographical origin, it is clear that a single analyte is unlikely to be adequate and that measurement of multiple markers is required. Determination of the dietary background and geographical origin of meat brings specific challenges not least of which is the possibility for animals to move between different jurisdictions and to consume foods from different sources and from different geographical origins over their lifetime”. In authoring this section, it was felt that a deep dive into analytical technology was inappropriate, as the major issues concern establishing traceability systems that can subsequently be tested in the market. This may be an area where technology-based solutions such as RFID (radio frequency identification) will continue to allow animal movement to be monitored in support of international trade. A recent review of beef testing methodologies is provided by Y. Bai et al. (2023).
Seven approaches were discussed in relation to their ability to determine the origin of beef. These are: stable isotope technique, DNA technology, spectroscopic technology, volatilomics technology, metabolomic analysis, fatty acid analysis and mineral element analysis. The main conclusion of the review was that “The supply chain for beef is highly complex”.
3.3.10. Limitations and gaps in using metabolomics and profiling techniques
- The validity of studies undertaken to date is highly dependent on the number and nature of the samples used as a reference collection. In many studies this is inadequate to reach firm conclusions and there is also doubt over the robustness of sampling plans in published studies.
- There is little evidence that there is ongoing collaboration to develop the skills, experience and reference data required to differentiate accurately between origins.
- The lack of a coordinated approach is evident in the diversity of the methodologies applied in a range of studies.
- The affordability and portability of instruments should be improved to aid uptake as the cost of the instrumentation may be prohibitive for routine analysis.
- Commercialising tools subsequently reduces transparency and is a barrier to future development.
- The confounding issues relating to meat origin mainly concern the movement of animals and meat products during rearing and supply, which, from an analytical perspective, makes it very difficult to identify the source of meat products. Analytical methods for the detection of meat origin also take little account of animal feed. Routine testing for origin and the establishment of databases for meat authenticity will fail if they do not account for feeding regimes and the movement of animals.
3.3.11. Conclusions for using metabolomics and profiling techniques
- Using a combination of methods analysing different types of food compounds seems to be the most promising approach to establish geographical origin, recognising the need for statistical approaches to bring together the data. When methods are used in combination this usually leads to greater accuracy of origin determination, linking data through multivariate statistical or chemometric methods.
- Using multiple data sets from different analytical sources generally improved classification rates and therefore the robustness of origin determination. Data fusion is an effective strategy for improving the classification performance.
- An initial critical step must be undertaken to ensure that there is good information in relation to geographical origin and any confounding factors such as vintage, to ensure that robust metadata are collected.
- The determination of provenance is perhaps held back by a lack of technical knowledge about modern techniques within specific sectors.
- Integrating instrumental and sensory attributes will help to identify relevant and comprehensive geographical quality indications.
- Geographical origin plays a critical role in determining the quality of products such as cocoa.
- Non-targeted methods can provide highly indicative data to support a weight of evidence approach for origin determination, particularly when taken with the results of other analytical tests and data from traceability systems.
3.3.12. Outlook in using metabolomics and profiling techniques
Metabolomics and profiling techniques will continue to play a role supporting geographical origin verification, especially in combination with other methods. Linking data through multivariate statistical or chemometric methods is an effective strategy for improving the classification performance.
There is a need to transition to more rapid, affordable and non-destructive analytical approaches. Non-invasive profiling technologies coupled with machine learning or artificial intelligence algorithms would provide a useful advance in fraud prevention. Using AI techniques requires the caveat that transparency is a key component. For some commodities, particularly in developing countries, onsite measurement is highly desirable and would aid global traceability and production of high-quality products.
There is also a need to coordinate international research to help to focus efforts appropriately. There should be a standardised and international approach to address questions around food origin, potentially at the level of an inter-governmental or international standards body. The harmonisation of methods and the curation of authentic samples are highlighted as essential to produce robust geographical indications and to establish terroir effectively. Tightening of (European) legislation and international harmonisation in relation to analytical approaches would help to prevent fraud.
Best practice requires that studies consider all data (a weight of evidence approach) when assessing origin.
Technology-based solutions such as RFID (radio frequency identification) should continue to be developed in conjunction with analytical methods, for the monitoring of the movement of products to support international trade.
3.4. Genomics
Genetic analysis for the assignment of geographical origin has not been widely used. The development of genetic tests, using the analysis of DNA, suitable for the assignment of geographic origin is inherently difficult. Such genetic tests would rely on the identification of private or near private DNA markers (Fontanesi, 2009). Animals and plants of different populations, but of the same species, can freely exchange genetic material by cross breeding, as there are no preventative biological barriers. Only selective breeding or genetic isolation can establish a unique sub-population of a species, and only if that sub-population is then anchored to a discrete location can geographical origin be assigned. To date the identification of these private genetic markers has been challenging. However, the development of next generation sequencing (NGS), which is able to scrutinise the genome in more detail, has enabled studies which may lead to tests able to assign geographical origin in the future.
3.4.1. Fish and seafood
An example of genetic isolation leading to two distinct, identifiable sub-populations in Baltic Sea cod (Gadus morhua) was published by Hemmer-Hansen et al. in 2019. Western and eastern Baltic cod have overlapping geographical locations, with a mixing zone in the Arkona Basin waters, south of Sweden. These fish populations would ordinarily be expected to interbreed. However, the two populations have temporally distinct spawning times. While western Baltic cod spawning is restricted to a few weeks in early spring, eastern Baltic cod spawn over a prolonged period of time peaking in the summer months (Hüssy, 2011). Environmental conditions at spawning time also add to genetic isolation in that they rarely support the survival of eggs of the western Baltic cod after the spring spawning period.
This study developed a panel of SNPs that showed high levels of population differentiation between eastern and western Baltic Sea cod. The panel was then used to analyse 2000 fish tissue samples collected between 2011 and 2015 in the mixing zone. The study was able to assign fish to either the eastern or western population with a very high degree of confidence (Hemmer-Hansen et al., 2019).
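How a SNP panel can assign an individual to one of two populations can be sketched with a simple likelihood comparison. This is an illustration only: the statistical method used by Hemmer-Hansen et al. is not described here, the allele frequencies and genotype below are invented, and Hardy-Weinberg proportions with independent loci are assumed. The function names are hypothetical.

```python
import math

def genotype_log_likelihood(genotype, allele_freqs):
    """Log-likelihood of a multi-SNP genotype in one candidate population.
    genotype[i] is the number of copies (0, 1 or 2) of the reference allele
    at SNP i; allele_freqs[i] is the frequency of that allele in the
    population. Hardy-Weinberg proportions and independent loci are assumed."""
    total = 0.0
    for copies, p in zip(genotype, allele_freqs):
        p = min(max(p, 0.01), 0.99)  # keep away from 0 and 1 to avoid log(0)
        q = 1.0 - p
        prob = {2: p * p, 1: 2 * p * q, 0: q * q}[copies]
        total += math.log(prob)
    return total

def assign(genotype, populations):
    """Return the most likely population and the log-likelihood of each."""
    scores = {name: genotype_log_likelihood(genotype, freqs)
              for name, freqs in populations.items()}
    return max(scores, key=scores.get), scores

# Invented reference-allele frequencies at four highly differentiated SNPs.
populations = {
    "western": [0.90, 0.80, 0.85, 0.10],
    "eastern": [0.20, 0.10, 0.15, 0.90],
}
fish = [2, 2, 1, 0]  # invented genotype of one fish from the mixing zone

best, scores = assign(fish, populations)
```

The larger the frequency differences between populations at each SNP, the larger the gap between the two log-likelihoods, which is why panels are built from highly differentiated markers.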
This study focused solely on the Baltic Sea cod population. However, it is known that there is mixing between the North Sea and western Baltic Sea cod, with individuals in that mixing zone being of mixed genetic heritage (Berg et al., 2015). It is not known how those fish would be classified using this panel of SNPs. This study was designed to facilitate the Baltic Sea cod fishery and was not designed to address the potential to assign the geographical origin of cod across the full geographical range. Nevertheless, the panel of SNPs developed during this study was highly informative and further work, using a greater number of samples from the full natural range of the Atlantic cod, may potentially reveal separate populations of cod based on geographical origin in the future. Spatial resolution in the marine environment is usually limited to large geographic distances because marine organisms generally show shallow population structure due to high dispersal capacity in the absence of strong barriers to gene flow. Next generation sequencing technologies, however, now allow the identification of hundreds or thousands of SNPs using the genotyping-by-sequencing (GBS) approach, as well as microbial community analysis using NGS-generated microbiome profiles (Milan et al., 2019).
A study compared the analysis of foot tissue SNPs with NGS-generated microbiome profiles of the hepatopancreas of Ruditapes philippinarum (Manila clam), a shellfish of high commercial interest with worldwide distribution (Milan et al., 2019). It was found that the clam place of origin could be located with high spatial resolution using the NGS-generated microbiome profiles of the hepatopancreas. Samples were collected in June and December/January in four areas of the Venice lagoon, over 2 years, although not all areas were sampled at each time point. Nevertheless, the spatial resolution for the origin of the clams remained over time. This initial study appears to show promise with regard to assigning geographical origin to Manila clams. However, the geographical area sampled was restricted to the Venice lagoon, an area with a distinctive marine environment. A wider study, using more samples from geographically diverse areas, is needed to confirm and extend this result, since work at Fera has shown that this type of analysis is not applicable to another type of shellfish: oysters (see Section 3.4.7).
A similar study was performed on the soft-shell clam species Mya arenaria collected in Canadian waters (X. J. Liu et al., 2020). The NGS-generated microbiome profiles of the clams could be reliably differentiated by harvest site, which remained true over the 3-year sampling period. The microbial diversity of these freshly harvested clams was much higher when compared with batches of clams from retail samples. The processing of the retail samples was unknown, but likely involved depuration, a process in which shellfish are held in a tank of clean seawater prior to retail, which results in the expulsion of intestinal contents and improves the safety of the final product. In this study, the number of operational taxonomic units (OTUs), which is a measure of microbial diversity, fell from 2994 in the fresh samples to 149 in the retail samples. Additionally, the microbial community of the retail samples was heavily dominated by Proteobacteria, a phylum that includes typical spoilage organisms of fresh seafood. It has previously been shown that the microbiomes of shellfish become more similar as they spoil (Madigan et al., 2014). Depuration and spoilage may well reduce the usefulness of this type of analysis for the attribution of geographical origin for shellfish. Interestingly, the study above, on Manila clams, was performed on fresh non-depurated samples (Milan et al., 2019). It will be important to extend these analyses along the whole production chain, from sea to market, to better understand where the change in the microbiome community occurs. The depuration/spoilage-induced loss of microbial diversity makes determining the geographical origin of shellfish more difficult in retail samples than in freshly harvested samples.
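The scale of the diversity loss reported above, and the way diversity can be summarised from OTU counts, can be illustrated with a short sketch. Only the figures 2994 and 149 come from the text; the OTU count tables are invented toy values, and the use of the Shannon index is an assumption made for illustration rather than a statement of the metrics used in the cited study.

```python
import math

def richness(otu_counts):
    """Number of OTUs observed at least once in a sample."""
    return sum(1 for count in otu_counts.values() if count > 0)

def shannon(otu_counts):
    """Shannon diversity index (natural log) from OTU read counts."""
    total = sum(otu_counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in otu_counts.values() if c > 0)

# Figures quoted in the text: OTUs in fresh versus retail samples.
fresh_otus, retail_otus = 2994, 149
retained_fraction = retail_otus / fresh_otus  # about 0.05, i.e. roughly 5% retained

# Invented toy count tables: an even community and one dominated by a single OTU.
fresh = {"otu1": 30, "otu2": 25, "otu3": 25, "otu4": 20}
retail = {"otu1": 95, "otu2": 5, "otu3": 0, "otu4": 0}

fresh_summary = (richness(fresh), shannon(fresh))
retail_summary = (richness(retail), shannon(retail))
```

A community dominated by one group scores low on both measures, which is the pattern described for the retail samples.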
3.4.2. Meat
A small study analysed the NGS-generated microbiome profiles of the Italian fermented sausage, Salame Piemonte (PGI) (Franciosa et al., 2021). Three batches of the salame were manufactured in February, March and May and were then allowed to ferment and ripen under the usual temperature and humidity conditions used for this product. The microbiome profiles of the three batches clearly showed inter-batch variability with Pediococcus pentosaceus, Latilactobacillus curvatus and L. sakei associated with samples from February, March and May, respectively. Further, no consistent microbial profile was identified for the salame across the three batches. Therefore, inter-batch variability negates the use of this type of analysis for this product.
3.4.3. Honey
An example of genetic isolation facilitating the designation of geographical origin is that of Mānuka honey. A study by Chagné et al., published in 2023, used a panel of SNPs to investigate the population structure of Leptospermum scoparium J.R.Forst & G.Forst, called Mānuka by Māori, and the basis of a flourishing honey industry in Aotearoa New Zealand and Australia. A previous study had been able to genetically differentiate Australian and New Zealand L. scoparium but had only used a single sample from the Australian island of Tasmania (Koot et al., 2022). The current study analysed 86 samples from Tasmania together with 418 samples from New Zealand using a high-density SNP array containing 9002 SNPs for mānuka. SNP arrays are a widely used technology for genotyping DNA of many organisms, including animals and plants for selective breeding or pedigree analysis, as well as elucidating phylogeographic patterns and taxonomic structure (Montanari et al., 2022). This study reported a strong genetic differentiation between the New Zealand samples and those from Tasmania. The authors also controversially stated that the differences were so large that the Australian L. scoparium trees should be subject to taxonomic revision and that honeys marketed from L. scoparium growing in Australia should not be called by the Māori term Mānuka (Chagné et al., 2023).
It should be noted however, that this report focused on genotyping leaves, whereas the commodity requiring geographical attribution is the honey from bees foraging on the L. scoparium blossom. Further work, therefore, to implement the use of this high-density SNP array for the verification of geographical origin of mānuka honey, would need to focus on samples of mānuka honey from New Zealand and Australia. In particular, the analysis of pollen from L. scoparium blossom found in mānuka honey would need to be assessed for compatibility with the use of high-density SNP arrays.
The traditional and most commonly used method for defining the regional origin of honey is melissopalynology, which aims to identify the plants from which bees have collected the nectar based on the morphological characteristics of the pollen found in the honey (Wirta et al., 2021). However, although low cost, melissopalynology requires expert knowledge and reference collections. In recent years NGS has been proposed as a high-throughput alternative, analysing the pollen extracted from honey (Özkök et al., 2023), or the pollen together with the DNA of plants, bacteria and fungi present in the honey (Wirta et al., 2021).
A recent study (Özkök et al., 2023) aimed to compare the melissopalynological method to NGS to determine honey’s botanical and geographical origin. A total of 74 honey samples were collected from 16 different areas of Turkey. The samples were subjected to both melissopalynological analysis and NGS of two chloroplast genes, for which a database containing thousands of accessions is publicly available. The results showed that NGS analysis could detect more diversity in the plant species than could melissopalynological analysis and, although the findings using both techniques were not identical, both were able to determine the honey’s dominant components. Cluster analysis of the NGS data revealed clustering according to broad geographical location in Turkey. Notable exceptions were samples taken from geographically distant regions with similar flora, which clustered together. This is a major drawback of studies such as this: the floral landscape is usually a continuum with poorly defined boundaries, and regions with similar climate, soil and altitude could have similar floristic profiles. Additionally, neighbouring countries will have similar floral landscapes, further confounding geographical attribution using pollen analysis.
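Why distant regions with similar flora cluster together can be seen from the dissimilarity measure that typically underlies such a cluster analysis. The sketch below uses the Bray-Curtis dissimilarity, which is an assumption made for illustration: the measure used by Özkök et al. is not stated here. The site names, taxon names and proportions are all invented.

```python
def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two taxon abundance profiles.
    0 means identical composition, 1 means no taxa shared."""
    taxa = set(profile_a) | set(profile_b)
    difference = sum(abs(profile_a.get(t, 0.0) - profile_b.get(t, 0.0)) for t in taxa)
    total = sum(profile_a.get(t, 0.0) + profile_b.get(t, 0.0) for t in taxa)
    return difference / total if total else 0.0

# Invented relative read abundances of plant taxa in three honeys.
# site_1 and site_2 are imagined to be far apart but to share similar flora.
site_1 = {"taxon_a": 0.60, "taxon_b": 0.30, "taxon_c": 0.10}
site_2 = {"taxon_a": 0.55, "taxon_b": 0.35, "taxon_c": 0.10}
site_3 = {"taxon_d": 0.70, "taxon_c": 0.30}

similar_flora = bray_curtis(site_1, site_2)    # small: the two honeys would cluster together
different_flora = bray_curtis(site_1, site_3)  # large: the two honeys would separate
```

The measure responds only to floral composition, so geographical distance between sites has no influence on the clustering unless it is reflected in the flora.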
A similar approach was taken in a study on Iranian honey (Khansaritoreh et al., 2020). The majority of Iranian honey is produced in four provinces located in the Irano-Turanian floristic region. The climatic conditions in this region result in a short flowering season and poor nectar production. The beekeepers of colder cities migrate to warmer areas in early autumn and stay at that destination until mid-spring, making the assessment of pollen for geographical assignment complicated. This study concentrated on identifying the key species which represented the flora of a specific location using an extensive literature review of reference books on the flora of Iran. NGS analysis of two separate genes was used to determine the floral profile of pollen from 70 samples of honey. The analysis was able to accurately plot the migration of the hives from their summer to winter feeding grounds. It was particularly successful because of the substantial difference between the flora of the north compared to the west and south-west of Iran. It was a limited study and did not obtain any samples from neighbouring countries, each of which would also, as before, have a floral landscape similar to that of Iran.
The approach taken by Wirta and collaborators (Wirta et al., 2021) was to use NGS to assess the presence of plant tissue, bacteria and fungi in honey from three neighbouring countries: Sweden, Finland and Estonia. This small study, analysing 48 commercial honey samples, was able to distinguish the country of origin using only the plant and fungal profiles of the honey, whereas the bacterial profiles did not separate the honey samples by country. Interestingly, filtering the honey, a common practice which makes honey remain liquid for longer, had only a minor effect on the taxonomic recovery of DNA. This approach, which does not rely on the pollen profile, is promising. However, the number of samples and the range of countries studied would need to be extended to confirm efficacy for the attribution of geographical origin.
3.4.4. Cereal
The genetic diversity and population structure of wheat has been the subject of many studies using DNA-based analyses. A small number of studies have attempted to link clusters of samples with their geographical origin. Combined analysis of random amplified polymorphic DNA and inter-simple sequence repeat markers was used to generate the population structure of wheat in India and Turkey. The study found that the Turkish hexaploid varieties were divided into two clusters: one group showed a close association with Indian hexaploid varieties and the other with Indian tetraploid varieties (Khan et al., 2015). This exemplifies one of the largest drawbacks for the attribution of geographical origin using DNA analysis of any kind in crops which have undergone extensive breeding. This is particularly so for wheat because most modern cultivars can be traced back to a breeding program based in Foggia, southern Italy, under the direction of Nazareno Strampelli (Kabbaj et al., 2017). Additionally, understanding wheat population structure as it pertains to geographical origin is hampered by the increase in germplasm exchange between breeding centers, causing changes to the historical structure of genetic diversity (Brbaklic et al., 2015). A large study, using a high-density SNP array to analyse 370 samples from 32 countries, found that the greatest influence on clustering was from the breeding programmes in different centers (Kabbaj et al., 2017), negating the influence of geographical origin. The study by Brbaklić et al. used microsatellite analysis on 284 wheat varieties to cluster the samples in a slightly different way: into six sub-populations. Although none of the sub-populations comprised samples from a discrete geographical origin, most of the cultivars from the same country clustered together. Cultivars clustering with a different country were most probably bred from a common or related ancestor.
Nevertheless, clustering into six sub-populations does not offer much granularity for the attribution of geographical origin (Brbaklic et al., 2015). Similarly, a study using a high-density SNP array was able to separate 726 samples broadly into seven groups: five geographical (Asia, Australia, Canada, Europe and the USA), one of historical landraces, and one of durum wheat varieties. The authors found that there was a high number of shared alleles across the five geographical locations, reflecting the use of common ancestors for breeding and relatively little development of regional populations by out-crossing with locally adapted varieties (S. C. Wang et al., 2014). Although this study was not designed for geographical origin assignment, the use of the high-density SNP array was able to start to separate some populations into geographical locations. It remains to be seen if this approach would be able to detect differences in commercial samples taken from different countries.
A SNP-based study of 406 samples of spring barley appeared to show some association between genetic clustering and geographical origin (Genievskaya et al., 2023). Samples were clustered into five sub-populations. Only two of these sub-populations contained samples from a single country: the USA (Sub-population 3) or Kazakhstan (Sub-population 2). The USA was also represented in two other sub-populations (4 & 5), whereas Kazakhstan was represented in three other sub-populations (1, 4 & 5). Europe and Africa were represented in the same two sub-populations (1 & 4), which also contained either Kazakhstan alone or Kazakhstan and the USA. Therefore, a sample assigned to Sub-population 4 could originate from any of the four regions, and a sample assigned to Sub-population 5 could originate from the USA or Kazakhstan. This method, therefore, is not an appropriate approach for attributing geographical origin to spring barley samples, and the overlap reflects the worldwide trade of spring barley.
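The ambiguity described in the paragraph above can be made explicit by tabulating which regions were reported in each sub-population. The mapping below is transcribed from that paragraph; the function names are hypothetical and the sketch says nothing about how the sub-populations themselves were derived.

```python
# Regions reported in each spring barley sub-population, as described in the text.
SUBPOPULATION_REGIONS = {
    1: {"Kazakhstan", "Europe", "Africa"},
    2: {"Kazakhstan"},
    3: {"USA"},
    4: {"USA", "Kazakhstan", "Europe", "Africa"},
    5: {"USA", "Kazakhstan"},
}

def candidate_origins(subpopulation):
    """Regions a sample could come from, given its assigned sub-population."""
    return sorted(SUBPOPULATION_REGIONS[subpopulation])

def is_diagnostic(subpopulation):
    """True only if the sub-population contains samples from a single region."""
    return len(SUBPOPULATION_REGIONS[subpopulation]) == 1

def subpopulations_for(region):
    """Sub-populations in which a given region is represented."""
    return sorted(sp for sp, regions in SUBPOPULATION_REGIONS.items() if region in regions)

ambiguous = [sp for sp in SUBPOPULATION_REGIONS if not is_diagnostic(sp)]
```

Three of the five sub-populations are ambiguous, and even a diagnostic sub-population does not capture every sample from its region, since `subpopulations_for("USA")` returns three sub-populations.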
3.4.5. Vanilla
Consideration of vanilla exemplifies some of the other issues around using genetic analysis for assignment of geographical origin. Produced from the cured seed pods of the tropical orchid genus Vanilla, it is the second most valuable spice in the world, after saffron. Originating in Mexico, it is now grown extensively in Madagascar, Indonesia and Mexico. Vanilla is only able to reproduce naturally within its native range in Mexico, where it co-occurs with specialised pollinators and dispersers, and so is usually propagated via stem cuttings from a plant which has not been allowed to flower (Lozano Rodríguez et al., 2022). Due to the absence of natural genetic recombination through pollination and the use of clonal vegetative propagation, the genetic diversity of globally cultivated Vanilla has become severely constrained, and cultivated plants are essentially genetically identical the world over (Ellestad et al., 2022). This lack of genetic variability precludes any form of DNA-based analysis for the assignment of geographical origin.
3.4.6. Saffron
Similarly, saffron is propagated asexually through division and replanting corms. A study by Busconi and co-workers found that there was little genetic variability between saffron samples from Europe, Africa and Asia, reflecting clonal propagation methods. This study focused mainly on Spanish saffron but did include 39 samples from around the world. Using amplified fragment length polymorphism analysis, the study found that a predominant genotype (A1) was present in Spanish, French and Greek samples, which all showed a uniform genetic make-up. The A1 genotype was also present in all the other production areas studied. There were however eleven less frequent genotypes identified which were restricted to single locations, although the numbers of samples from each location varied only between three and five samples. The study however found high epigenetic variation between the samples which was consistent with their geographical origin. Epigenetic variations are influenced by environmental conditions and the study theorised that it would be very likely that samples from cultivation areas under different climates could be characterised by different epigenomes. The techniques used in this study are relatively insensitive with only 47 polymorphic loci identified (Busconi et al., 2015). The use of more modern techniques, perhaps high-density SNP arrays or GBS could identify more geographically anchored epigenomes: a first step in geographical attribution of saffron.
The microbiome fingerprint of saffron corms was investigated in a small feasibility study across three production sites: Morocco, Kashmir (India) and Kishtwar (India) (Bhagat et al., 2021), with a view to determining whether the cultivation areas could be distinguished from each other. It was found that the cormosphere, the collection of all microbes (for example bacteria, fungi, viruses) associated with the corm, harboured common phyla across the three sites, with 24 genera found at all locations. However, there were some bacterial genera unique to each of Kashmir, Kishtwar and Morocco, which could be used to develop microbial markers for the geographical origin of corms from these regions. One significant drawback of this study is that it was the cormosphere which was studied, and yet it is the stigma of the plant which enters the human food chain. It would be interesting to determine whether the variation in the cormosphere extends to the stigma; otherwise, this study is of little use for the attribution of geographical origin of saffron stigmas.
3.4.7. Government Research Projects using genomic analysis
3.4.7.1. EU Framework 6 Project TRACE: Tracing (the Origin of) Food Commodities in Europe (2005-2009)
An objective of Work Package 3 was the identification of PCR markers for the detection of plant species related to honey. This Work Package aimed to develop rapid, robust, accurate and cost-effective methods for determining the species/varietal origin of food using real-time PCR. The project focused on the development of methods which could authenticate “Miel de Corse”, a protected designation of origin (PDO) product. The study developed and validated real-time PCR systems for the detection of plant species commonly found on Corsica (sweet chestnut, lavender, eucalyptus, rockrose, oak and broom), which can be used to build a profile of the plants used by bees as forage during the production of “Miel de Corse”. Additionally, detection systems using real-time PCR for other plant species commonly identified in honey (acacia, linden, citrus, clover, heather, olive, rape, sunflower and rosemary) were also developed and validated. These real-time PCR systems were then used to distinguish Corsican honey from samples of honey from other geographic regions, including “Miel de Galicia”, a protected geographical indication (PGI) honey, as well as German and English honeys.
This study showed that a combination of species specific systems, selected from the pool of real-time PCR systems developed during this project, was able to produce a plant species profile unique to Corsican honey when compared to honey from Galicia, Germany and England (Laube et al., 2010), although the number of samples analysed was comparatively small and the locations relatively geographically remote from each other. Honey from Germany and England were easily distinguishable from Southern European honey by the detection of DNA from oilseed rape: a favourite foraging flower for bees, but not a crop grown in Southern Europe.
3.4.7.2. Development of Metagenomic Methods for Determination of Origin (Defra reference FA0160)
Between 2015 and 2019 this project piloted the application of microbial metabarcoding to trial the identification of microbial communities related to the origin of i) oysters, and ii) Stilton cheese.
The overall aim of this project was to use the microbial community associated with oyster gills as a blueprint for the application of metagenomics in food authenticity, and then to test the approach in a new commodity, blue Stilton cheese (a Protected Designation of Origin commodity). More specifically this involved the following objectives:
- Transfer methodology from the obsolete Roche 454 (Roche Diagnostic, Netherlands) DNA sequencing platform to the current MiSeq (Illumina, UK) DNA sequencer.
- Increase replication for the oyster analysis, including samples from multiple sites and seasons and a non-UK source (northern France).
- Attempt to identify regional signatures in the oyster bacterial community, and uncertainty in the approach.
- Apply the methodology to another commodity, PDO Blue Stilton cheese.
To address these aims, a total of 450 oyster samples were collected by the Centre for Environment, Fisheries and Aquaculture Science (Cefas) from around England, from sites in Cornwall, Dorset, Essex and Northumberland. Ten oysters were collected at each collection visit, and sampling took place from April 2015 to November 2016. A further 60 oysters were collected from two sites in northern France, between October 2016 and February 2017. Oyster gills underwent DNA extraction and PCR amplification at the 16S v4 variable region, followed by sequencing on a MiSeq (Illumina, UK) DNA sequencing instrument. A total of 504 samples were successfully sequenced for further analysis. Sequences were clustered into groups based on their sequence similarity, and representative members of these groups were compared to a database of known bacterial sequences, to determine which bacterial taxa were present in which samples. After sequence analysis was complete, Bayesian statistical methods were used to attempt to identify different groups of taxa that are associated with different geographical origins.
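The sequence clustering step described above can be illustrated with a minimal sketch. It assumes pre-aligned reads of equal length, a 97% identity threshold (a common convention for Operational Taxonomic Unit clustering, not a value stated for this project) and a simple greedy strategy. The function names and toy sequences are hypothetical, and this is not the pipeline used in FA0160.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sketch assumes pre-aligned, equal-length reads")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(reads, threshold=0.97):
    """Assign each read to the first centroid it matches at >= threshold
    identity; a read that matches no centroid founds a new cluster."""
    clusters = []  # list of (centroid, members) pairs
    for read in reads:
        for centroid, members in clusters:
            if identity(read, centroid) >= threshold:
                members.append(read)
                break
        else:
            clusters.append((read, [read]))
    return clusters


# Toy reads (hypothetical, 40 bases each)
base = "ACGT" * 10
near = base[:-1] + "A"   # one mismatch in 40 positions
far = "TTGA" * 10        # mostly different from base

clusters = greedy_cluster([base, near, far, base])
for centroid, members in clusters:
    print(centroid[:12] + "...", "members:", len(members))
```

In practice the representative sequence of each cluster would then be compared against a reference database to assign a taxon; that lookup is omitted here.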
Once this methodology was established, it was rolled out to Stilton cheese. Across 2016 and 2017, 101 cheese samples were taken, either directly from the various Stilton producers or from retail sale in the UK. Samples comprised Blue Stilton cheese (n=62) and, for comparison, UK non-Stilton blue cheese (n=30) and non-UK blue cheese (n=9). Samples were processed as per the oyster samples, except that two marker regions were analysed: the 16S rRNA gene, for identification of bacteria, and the Internal Transcribed Spacer (ITS) region, for identification of fungi.
The bacterial communities in the oyster samples were highly variable, even among samples from the same time and location, and oysters could not be assigned to regions based on bacterial sequences. The blue cheese data were more promising, and samples could be assigned as Blue Stilton with 72-79% accuracy. There was also some ability to assign to individual producers (40-60% accuracy).
This project would benefit from further work in at least two areas: re-analysis of the existing data with more modern analytical techniques (for example de-noising and Amplicon Sequence Variant analysis, instead of Operational Taxonomic Unit clustering), and trialling on other foodstuffs with a microbial component to their production.
3.4.8. Limitations and gaps in using genomics techniques
In summary, this literature review did not locate any studies where the use of DNA-based analyses was mature enough to be put into practice. The majority of studies could, at best, be called pilot studies, due to restricted numbers of samples, low numbers of locations and little sampling between years and seasons. Methods which assess the DNA of the target species are very difficult to anchor to a geographical location. For example, the Korean black pig breed, which is normally restricted to Korea and was described using microsatellite markers (Oh et al., 2014), could equally be reared in any other location. Swamp and river buffalo, which are usually found in South-East Asia and were described using a SNP panel (Pérez-Pardal et al., 2018), could be reared in any comparably warm water body. Putre’s oregano, a subspecies of oregano recognised in Chile with a Seal of Approval (Contreras et al., 2021) and described using sequencing and a SNP panel, could be grown in any other location where oregano grows.
3.4.9. Conclusions for using genomics techniques
In conclusion, DNA-based analysis is not currently well placed for designating geographical origin. It can, however, add value to traditional chemical-based analyses in a combined approach by verifying the organism or product of interest, thereby increasing the veracity of the final geographical origin designation.
3.4.10. Outlook in using genomics techniques
What is undeniable is that DNA-based analyses are unmatched in being able to identify species and to further differentiate subspecies, varieties and breeds. Where DNA can probably add the most value in ascribing geographical origin is when it is used in tandem with a technology more traditionally used for geographical origin determination. An example of this is the combined use of high-throughput sequencing using NGS with metabolomics using GC-TOF-MS analysis on Panxian ham, a traditional Chinese dry-cured ham protected by national geographical indication (Mu et al., 2020). Another study (J. R. Fernandes et al., 2015) combined a couple of DNA-based methods to identify the grapevine variety for Portuguese wine and assessed geographical origin using 87Sr/86Sr isotope ratio data. The further development of SNP databases could provide a powerful tool to be used in tandem with other chemometric techniques.
3.5. Proteomics
Publications reporting the use of proteomic approaches for determination of geographical origin of food are scarce, with only a few studies identified during the review of published and grey literature performed for this project, as outlined below.
3.5.1. Honey
A recent review of proteomics for food authenticity (Afzaal et al., 2022) included one article that described the application of proteomics to geographical origin determination of honey (J. Wang et al., 2009). This work was based on fingerprinting of proteins by matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS), with protein barcodes generated using MALDI Biotyper 1.1 software to analyse the protein profiles. A database of protein profiles (barcodes) was created using mass spectra of 16 honey samples of known Hawaii origin. Commercially purchased honeys (n=38) with labels indicating origin from different countries and various states of the USA, including Hawaii (n=15), were tested against the database. All the samples from Hawaii showed a good correlation with the barcodes from the authentic honeys in the database (correlation coefficient ≥ 0.75), although four samples from other countries or US states also showed correlation coefficients in the same range. Although these preliminary data look interesting, a larger database and further analysis of many more samples from different regions would be needed to assess the performance of the approach.
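The database matching described above can be sketched as follows. The sketch assumes each spectrum has already been reduced to a vector of peak intensities on a shared mass grid, and it uses a plain Pearson correlation as a stand-in for the scoring performed by the MALDI Biotyper software, whose internal algorithm is not described here. Only the 0.75 threshold comes from the study summary; the reference names and intensity values are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length intensity vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def best_match(sample, reference_db, threshold=0.75):
    """Return (reference name, r, accepted) for the best-correlated reference."""
    scores = ((name, pearson(sample, ref)) for name, ref in reference_db.items())
    name, r = max(scores, key=lambda item: item[1])
    return name, r, r >= threshold


# Hypothetical reference barcodes and test spectra (arbitrary intensity units)
reference_db = {
    "authentic_ref_1": [0, 5, 1, 8, 2, 0],
    "authentic_ref_2": [1, 4, 0, 9, 3, 1],
}
similar_sample = [0, 6, 1, 7, 2, 1]
dissimilar_sample = [9, 0, 7, 1, 0, 8]

for label, spectrum in [("similar", similar_sample), ("dissimilar", dissimilar_sample)]:
    name, r, accepted = best_match(spectrum, reference_db)
    print(f"{label}: best reference {name}, r = {r:.3f}, accepted = {accepted}")
```

A single threshold of this kind accepts any sample whose profile happens to resemble a reference, which is one way the false matches reported in the study could arise.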
3.5.2. Cocoa
A proteomics approach was used to compare the protein profiles of 25 cocoa samples from different continents and plant hybrid varieties (Kumari et al., 2018). The study involved two-dimensional separation of proteins by gel electrophoresis followed by trypsin digestion, mass spectrometry analysis and protein identification. Statistical analysis of the results obtained with non-fermented beans (proteins detected and their abundances) showed a correlation with the geographical origin of the samples (at continental level) and not with the cocoa hybrid. The authors also observed a correlation between the total protein content of the non-fermented samples and the soil pH, and they speculated that the quality of the soil and the growth conditions may influence protein accumulation in the plants. Although these results provide an interesting characterisation of the cocoa beans studied, the approach is very laborious and not suitable for routine testing. The authors discussed the challenges of protein quantification, as different methods produce different results due to the complexity of the cocoa matrix. This is likely to be the case for protein extraction also, which would need to be optimised and standardised to achieve robust data. Although the protein profile correlated with the continent of origin of the cocoa and not with the hybrid variety, there was one variety, CCN5, that was very different to the rest and made South American samples cluster with African samples in a PCA plot. This indicates that a sufficiently large number of samples from different locations and hybrid varieties would be required to fully assess the performance of this approach.
3.5.3. Limitations and gaps in using proteomics techniques
The lack of studies on the use of proteomics for geographical origin designation in foodstuffs reflects the inherent problems with anchoring protein profiles to a unique geographical location. Proteins are ultimately derived from the DNA sequence of the target organism and, as with genomic analysis above, it is difficult to link genomic sequences to a geographical location when the intervention of man is taken into account. This is exemplified by the study on Hawai’ian honey. The Hawai’ian Islands are home to a large array of unique plants, due to their geographically isolated location. It has been estimated that there are approximately 1,400 vascular plant taxa (including species, subspecies, and varieties) native to the State of Hawai’i, of which nearly 90 percent are found nowhere else in the world. It would be assumed, therefore, that this geographic isolation would produce clearly distinct and unique barcodes for Hawai’ian honey. However, the arrival of humans brought new plant species to Hawai’i, not only agricultural plants but also ornamental plants. The attribution of Hawai’ian honey to a Hawai’ian provenance was impressive, save for the fact that four non-Hawai’ian honeys (roughly 10% of all commercial samples tested, or 17% of the 23 non-Hawai’ian samples) would have been wrongly classified as Hawai’ian (J. Wang et al., 2009).
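The false-positive rate implied by the honey study depends on which denominator is chosen. A short calculation, using only the sample counts quoted in section 3.5.1 (the variable names are illustrative), makes the two options explicit:

```python
commercial_total = 38   # commercially purchased honeys tested against the database
labelled_hawaii = 15    # of which labelled as originating from Hawaii
false_matches = 4       # samples from elsewhere with correlation coefficient >= 0.75

non_hawaii = commercial_total - labelled_hawaii

rate_non_hawaii = false_matches / non_hawaii       # share of non-Hawaiian samples
rate_all = false_matches / commercial_total        # share of all commercial samples

print(f"non-Hawaiian samples tested: {non_hawaii}")
print(f"false matches among non-Hawaiian samples: {rate_non_hawaii:.1%}")
print(f"false matches among all commercial samples: {rate_all:.1%}")
```

For judging the risk that a non-Hawaiian honey passes as Hawaiian, the rate among non-Hawaiian samples is the more relevant of the two figures.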
Although less advanced than other technologies, the use of proteomics for geographical origin determination is advocated by some due to certain advantages such as the higher stability of peptides to processing compared to DNA, or the effect of processing and production methods on specific chemical modifications of amino acids or formation of peptides that can be used as biomarkers (Ortea et al., 2016). The Greek consortium FoodOmicsGR_RI, which coordinates the development of Omics tools to support the agri-food sector, is focusing its proteomics efforts on cheese and is using nano-LC/HRMS to characterise the proteome and peptidome of Greek cheeses. The aim is to create a database of proteins and peptides identified in the original milk and in serial samples collected during the production and maturation processes. This consortium has also used MALDI-TOF MS to create a reference library of bacteria present in various Greek cheeses based on ribosomal proteins along with some housekeeping proteins. The use of Bruker MALDI Biotyper software enables pattern-matching to the reference library for identification of bacterial species present in different cheeses and their correlation with quality and safety traits. The consortium claims that this approach can also be applied to PDO determination by studying the microbial diversity of different PDOs. Limitations that need to be addressed for further development of proteomic approaches for geographical origin determination include standardisation or detailed reporting of experimental details, the use of powerful statistical analysis and an appropriate number of replicates, and access to advanced MS equipment and bioinformatics tools. As for other technologies, extensive, well-curated databases are critical to obtain robust data.
3.5.4. Conclusions for using proteomic techniques
In conclusion, as with DNA-based analysis, proteomic techniques are not currently well placed for designating geographical origin. They can, however, add value to traditional methodologies in a combined approach by verifying the organism or product of interest, thereby increasing the confidence in the final geographical origin designation.
3.5.5. Outlook in using proteomics techniques
Due to the technical complexity and economic cost of proteomic approaches for geographical origin determination, it can be anticipated that developments in the short and mid-term will be targeted to specific commodities and geographical indication questions such as PDO and PGI to support marketability and brand protection.
3.6. Emerging techniques
This section focuses on the wide range of emerging technologies and computational tools which have been used in a research and development setting and which have been applied to a small number of samples and commodities. Some of these more promising technologies are discussed below, and a summary of other techniques identified in this literature search is presented in Table 4.
3.6.1. Corona Discharge Mass Spectrometry
Corona Discharge Mass Spectrometry (CD-MS) was applied for the first time to categorise black (n=11) and white (n=7) pepper seeds from different origins (Charoensumran et al., 2021). This simple and rapid MS-based method combined with chemometric analysis demonstrated the ability to distinguish geographic origins from chemical profiles with discrimination efficiencies > 98%. The authors state that this approach has potential to be used as part of a portable setup for remote and onsite analysis. Despite the small dataset, this outcome suggests the methodology could be applied to a broader range of agricultural products for origin determination without any pre-treatment. Any future applications should focus on larger datasets and a broader range of samples.
3.6.2. Excitation-Emission matrix and Synchronous Fluorescence
Excitation-Emission matrix (EEM) and Synchronous Fluorescence spectroscopy were used for the first time to discriminate saffron from three different Moroccan provenances (n=18) (El Hani et al., 2023). Moreover, global geographic discrimination between samples from Morocco, Afghanistan (n=10) and Iran (n=10) was achievable through PCA and LDA. Despite the small number of samples and the limited geographic coverage of this study, this non-destructive, simple, and reliable method could precede verification of saffron adulteration and quantification of fluorophore compounds for valorisation in applications such as cosmetics, food, and pharmacy.
3.6.3. Intelligent Sensory Technologies
PCA of Electronic nose (E-nose) data could rapidly distinguish samples of fresh instant rice (n=18) from three different Chinese provinces (Ren et al., 2023). The solid phase microextraction-gas chromatography-mass spectrometry (SPME-GC-MS) results for samples from different provinces were clearly distinguished in PCA and hierarchical cluster analysis. Ten compounds were identified as potential markers of the three Northeast Chinese fresh instant rice provenances. The authors conclude that the strategy of applying flavour profiles for determining geographical origin of fresh instant rice was an effective and non-destructive technique. Although this study covers a limited area in China and a specific product, it stands as a proof of concept for future analysis of more diverse sample types.
E-nose was also utilised in a study to discriminate the geographical origin of ginger (D. X. Yu et al., 2022). In this study, HS-GC-MS and fast GC e-nose were used to distinguish the varieties and geographical origins of dried gingers from seven major production areas in China. After chemometric analysis, distinct separation of two different varieties of ginger was achieved on the HS-GC-MS data. However, this method was not effective for origin identification. Flavour profiles extracted by fast GC e-nose demonstrated better identification of ginger varieties and geographical origins based on pluralistic chemometrics and could be applied to trace the source and region of ginger. This study provided evidence of the applicability of these technologies to ginger authenticity within China.
A novel voltammetric electronic tongue (VE-tongue) system based on three nanocomposite-modified working electrodes was used for the discrimination of red wine from different geographical origins (Zheng et al., 2022). Wine samples (n=120) from four different denominations of origin (France, Australia and two locations in China) were bought from a local supermarket and three working electrodes were applied for classification and prediction using a PCA plot. The authors conclude that the three novel working electrodes delivered a VE-tongue system that can successfully discriminate different red wine samples by their denominations of origin, thus cutting down the detection cost of a versatile E-tongue system without interfering in the discrimination capacity of the system. The method would require further validation against a larger dataset to test and improve this model.
A comprehensive literature review addressing the authenticity determination of a variety of alcoholic beverages through intelligent sensory technology (IST) was presented by Wang et al. (2022). The techniques covered were E-nose, E-tongue and E-eyes. They concluded that IST have been successfully applied in quality assessments of alcoholic beverages, in terms of variety and geographical origins, monitoring of production processes, detection of frauds and adulterations, discrimination of years of aging, distinction of brands and types, aroma analysis, and detection of spoilage and off-flavours. However, E-nose and E-tongue instruments still need improvement, especially the development of bioelectronic sensor arrays with high sensitivity and selectivity, aimed at improving the accuracy and reliability of the analysis.
3.6.4. Sesquiterpene hydrocarbon fingerprinting by Headspace – Solid Phase Micro Extraction and Gas Chromatography-Mass spectrometry
Headspace – solid phase micro extraction (HS-SPME) and GC-MS was shown to be a fit-for-purpose tool for virgin olive oil geographical authentication (Quintanilla-Casas et al., 2022). Virgin olive oils produced from EU (n = 246) and non-EU (n = 154) origins were correctly classified using chemometrics for 89.6% of samples during external validation experiments. The sesquiterpene hydrocarbon (SH) fingerprint provided a large amount of information, but the PLS-DA allowed discrimination of the most relevant variables according to the origin categories. Successful results were also obtained for classification models by country. Between EU countries the model correctly classified 92.2% of samples and between non-EU countries 96.0%. These are remarkably high classification rates considering the natural heterogeneity of the oil and analytical variability. The group concluded that the proposed approach could be scaled down to authenticate the origin of oils obtained from smaller and closer areas of origin.
3.6.5. GC-Ion Mobility Spectrometry
Palm oil is one of the most economically important products in Malaysia. Malaysia and Indonesia together are responsible for 85% of global palm oil production, and the industry is predicted to expand due to demand. Palm oil is commonly traded as a global commodity, with batches from different sources often being mixed at multiple stages during processing, shipment, refining, storage, and delivery to end users. The traceability of palm oil within the supply chain is therefore a challenging issue. Several certification schemes, such as Malaysian Sustainable Palm Oil (MSPO) and the Roundtable on Sustainable Palm Oil (RSPO), have been set up to assure the sustainability of palm oil production and its traceability across the supply chain. The current measures in place for the traceability certification of palm oil production are based largely on paper trails and audits. Work has been undertaken by IAEA using gas chromatography ion mobility spectrometry (GC-IMS), coupled with principal component analysis (PCA) and orthogonal projections to latent structures discriminant analysis (OPLS-DA), for the geographical discrimination of crude palm oils from Malaysia (“Food Safety and Control Newsletter Vol. 01 No. 1, July 2022,” 2022). Crude palm oil samples were collected over 6 months (February–July 2019) at 4 different Malaysian locations. A supervised chemometric approach, OPLS-DA, was able to discriminate East Malaysia from Peninsular Malaysia in most of the months. An example of the OPLS-DA score plot of the oil samples, collected in July 2019, is shown in Figure 1. The goodness of fit (R2X(cum), R2Y(cum)) and the predictive ability (Q2(cum)) values of the 7-fold cross-validated OPLS-DA model were 0.898, 0.899 and 0.745, respectively.
As a next step, the IAEA team assessed the ability of the OPLS-DA model to discriminate all 4 geographical locations. The discrimination of samples from Peninsular Malaysia was achieved in most of the months. Further work on the geographical discrimination of crude palm oil samples at IAEA has included the use of FT-NIR spectroscopy and stable isotope analysis.
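The goodness-of-fit and predictive-ability statistics quoted for the OPLS-DA model follow standard definitions: R2Y is one minus the residual sum of squares over the total sum of squares for a model fitted to all samples, and Q2 is one minus the cross-validated prediction error sum of squares (PRESS) over the same total. The sketch below illustrates these two quantities only. It substitutes a one-variable least-squares line for the OPLS-DA model, uses invented scores and 0/1 class labels, and does not compute R2X; it is not the IAEA workflow.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x (a stand-in for the OPLS-DA model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def r2_and_q2(x, y, folds=7):
    """Return (R2, Q2): R2 from a fit on all samples, Q2 = 1 - PRESS/TSS
    from k-fold cross-validation with interleaved folds."""
    mean_y = sum(y) / len(y)
    tss = sum((yi - mean_y) ** 2 for yi in y)

    a, b = fit_line(x, y)
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

    press = 0.0
    for k in range(folds):
        held_out = set(range(k, len(x), folds))  # every folds-th sample
        train_x = [xi for i, xi in enumerate(x) if i not in held_out]
        train_y = [yi for i, yi in enumerate(y) if i not in held_out]
        a_k, b_k = fit_line(train_x, train_y)
        press += sum((y[i] - (a_k + b_k * x[i])) ** 2 for i in held_out)

    return 1 - rss / tss, 1 - press / tss


# Hypothetical scores for two regions (class 0 and class 1), 7 samples each
scores = [1.0, 1.2, 0.8, 1.1, 0.9, 1.3, 0.7, 2.9, 3.1, 3.0, 2.8, 3.2, 2.7, 3.3]
labels = [0] * 7 + [1] * 7

r2, q2 = r2_and_q2(scores, labels, folds=7)
print(f"R2 = {r2:.3f}, Q2 = {q2:.3f}")
```

Because held-out samples are predicted by a model that never saw them, Q2 cannot exceed R2 for a least-squares fit, and a large gap between the two is usually read as a sign of overfitting.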
3.6.6. Spectroscopy
Solid phase microextraction-gas chromatography-mass spectrometry (SPME-GC-MS), IR and Raman techniques were reviewed as tools for the botanical and geographical characterisation of honey, including through its volatile profile (Sotiropoulou et al., 2021). Chemometric analysis was performed on the data from all techniques reviewed. SPME-GC-MS identified volatile compounds in honey that could be used as markers of botanical and geographical origin. Numerous volatile components from a variety of botanical sources and geographical origins, mainly in mono-floral honeys, are listed. The authors note that the heating step to isolate the volatile components is the main disadvantage of this method, as non-characteristic compounds are introduced into the samples. However, the technique is solvent free, inexpensive, rapid, and simple. They conclude that SPME-GC-MS based on the volatile fraction provided reliable results for determining the authenticity of honey in terms of botanical and geographic origin. It is not clear how effective this approach would be when challenged with multi-floral honey types. IR and Raman spectroscopies were determined to be suitable for the evaluation of botanical and geographical origin of honey when combined with chemometric analysis. A multi-spectroscopic approach using complementary techniques, with data interpretation through chemometric assessment, is required for the continued development and understanding of methodology related to honey authenticity.
Fluorescence spectroscopy in conjunction with parallel factor analysis (PARAFAC), PCA and SIMCA (soft independent modelling of class analogy) was used for the development of geographic and botanical discrimination models to differentiate among distinct honey classes (Ramona-Crina et al., 2022). Honey samples (n=96) from seven botanical sources and two geographical origins (France and Romania) were studied. The study demonstrated the efficiency of combining EEM fluorescence spectroscopy with chemometric methods for honey discrimination according to geographical and botanical origin. A differentiation rate of 95.8% was achieved for geographic origin and 94.5% for botanical varieties. No meaningful distinctions were recorded for the honeydew, linden and acacia honeys produced in the two countries. It was also noted that differentiation of colza and sunflower honeys might be caused by the distinct agricultural practices in Romania and France.
Honey samples (n=1040) from 6 different botanical origins were collected from 4 different locations within Indonesia and analysed by ultraviolet (UV) spectroscopy to determine authenticity (Suhandy & Yulia, 2021). A SIMCA classification method was applied to the data. This technique was demonstrated to be a simple and low-cost analytical method for authenticating Indonesian honey for botanical, entomological and geographic origin. This study was limited in the range of geographic and botanical origins covered, and the suitability of this methodology for larger and more complex datasets is questionable.
Various molecular and atomic spectroscopic techniques (1H NMR, portable NIR, benchtop NIR, attenuated total reflectance Fourier transform mid-infrared spectroscopy (ATR-FTIR-MIR) and flame atomic absorption spectrometry (FAAS)) were used to characterise and discriminate Brazilian Canephora coffees of specific producers, including two with geographical indication, and also to differentiate them from Arabica (Baqueta et al., 2023). The main objective was to compare the different analytical techniques and evaluate their feasibility for discrimination, in order to determine their role in real-time applications. The sample set comprised 100 Canephora samples of different geographical origins in Brazil (Conilon from Espírito Santo, Amazonian Robusta from indigenous and non-indigenous producers of Rondônia, and Conilon from Bahia) and Arabica coffee (25 samples). The authors concluded that although there was a contribution from all the different techniques for characterisation, the multi-block discrimination showed that NIR spectroscopy dominated for this purpose. Due to this finding, benchtop NIR and portable NIR were compared. Portable NIR provided slightly inferior results to benchtop NIR through variable selection.
Given that benchtop FT-NIR is a highly accessible and novel rapid screening technology, work is underway by IAEA to develop methods for the geographical discrimination of green coffee from Costa Rica and of crude palm oil from Malaysia, and the verification of authenticity of organic strawberries. A training programme will support Member State efforts to improve their food safety and authenticity control systems and raise awareness of this highly accessible technology.
Spectroscopic techniques (NIR, MIR, Raman and UV-vis) were reviewed for rapid compositional analysis, authenticity, and traceability in beer and wine (Chapman et al., 2019). The authors note that all these techniques are rapid and easy to set up with very little or no sample pre-treatment. However, there are drawbacks such as requirements for application-specific calibrations and overlapping signals. They concluded that these analytical platforms provide new opportunities for compositional analysis of beer and wine samples as well as help with authenticity, regional discrimination, and traceability. There are issues to resolve such as the availability of data mining tools and spectroscopic databases, and difficulties combining the methods when large datasets are generated. There is also limited knowledge of what molecular changes take place during production of beer and wine. The paper concludes that future developments in vibrational spectroscopy will require a new approach to the analysis and interpretation of the data generated. With the application of these techniques, food science analytics is moving away from data being a discrete range of numbers reflecting what has happened, to a more dynamic space where data mining, modelling, and big data together provide relevant information to better understand complex interrelationships, processes, and functionality.
A review of the application of handheld and portable spectroscopy-based devices to food authenticity monitoring and traceability verification was conducted by McVey et al. (2021). Specifically, NIR, MIR, Raman, Vis-NIR, Visible and hyperspectral imaging (HIS) were reviewed as established technologies in this area. The NIR spectroscopy devices, most of which are commercially available, are miniaturised, low-cost handheld instruments. These devices have a good classification capability when the resulting data are analysed with chemometric tools. MIR devices are portable systems whose miniaturisation is limited by moving parts and detector requirements. These factors also make them less cost effective and can compromise the performance of hand-held versions. However, they do have advantages over NIR due to the ability to identify and characterise structure and isolate mixtures for quantification. Raman devices are a favoured technology for development in food adulteration due to low cost, long life, portability, and high sensitivity. Although negatively impacted by noise and other interfering signals, they offer rapid measurement times. The development of portable devices in the Vis wavelength region has proven successful due to the availability of high performing, low-cost photodetector arrays. Portable, handheld Vis and Vis-NIR devices share many advantages seen in NIR as they are cost effective, non-destructive, rapid and can be used for chemical characterisation of foods. Although these devices are not as established as the NIR and MIR technologies, they have shown promising developments in this area. HIS is emerging as a powerful tool for food authenticity analysis. Collecting both spectral and spatial information from a sample allows HIS to characterise complex heterogeneous mixtures as well as identifying surface and sub-surface constituents. This technique is the most flexible as it can analyse numerous samples at the same time. However, the cost of these instruments is high.
White wine samples (n=180) from 3 north-eastern Italian varieties were analysed by Surface Enhanced Raman Spectroscopy to determine their classification (Zanuttin et al., 2019). Using the ratios of three chemical characteristics, discrimination could be made between different wine varieties and producers. The results show that the quality features of these wines are due to local environmental characteristics and wine making processes related to the winery. The portable nature of this methodology is an advantage but proof of the concept on a larger, more complex dataset is required for it to be utilised as a global authentication tool.
Laser-induced Breakdown Spectroscopy has also been demonstrated as a promising emerging technology for determining the geographical origin of commodities such as olive oil (Gazeli et al., 2020; Gyftokostas et al., 2020).
3.6.7. Other techniques
Samples of dark chocolate originating from Africa (n=15), Asia (n=11) and South America (n=31) were analysed by Flow Infusion-Electrospray Ionization-Mass Spectrometry (FI-ESI-MS) to assess the geographical origin of cocoa beans (Acierno et al., 2018). The results of chemometric assessment confirmed separation between African and Asian origin chocolate but no clear trend for South America. The inability to separate all three continents was linked to factors such as brand-related formulations, climate and growing conditions, and industrial processing.
Absorbance-transmission and fluorescence excitation-emission matrix (A-TEEM) spectroscopy was used for the first time to discriminate Shiraz wines (n=186) at sub-regional level within the Barossa and Eden Valleys (Ranaweera et al., 2023). Clear vintage variations were seen between the samples. A 98.4% accuracy in classification was achieved when modelling both regional and vintage classifications. The method would require further validation against a larger dataset to test and improve this model.
3.6.8. Blockchain
Several papers and review articles have been published which assess the use of digital ledger Blockchains for traceability and monitoring of food products from farm to shelf. Blockchain technology was studied by Peng et al. (2022) as a mechanism for collection of data in the rice supply chain. They concluded that theoretical verification of this emerging technology for resource sharing in the food supply chain provides digital ideas transferable to the food industry.
Blockchain, electronic tagging, and the combination of these traceability efforts are relevant for a growing number of specifications. Emerging information and technology play a vital role in electronic agriculture (Saurabh & Dey, 2021). The technology adoption factors for the grape wine supply chain were studied using conjoint analysis. When designing a modular information system architecture for grape wine supply using Blockchain technology, Saurabh and Dey found that perceived disintermediation was the most important factor for users, followed by traceability and then price. The direction of their future research is to expand to a system for deep traceability of bottled wine to wineries and vines.
A traceability prototype system was established to verify a case analysis of grain and oil traceability (J. P. Xu et al., 2022). A whole chain of grain and oil quality and safety was constructed, and a classification table of key information was built. The traceability system proposed had the advantages of data tamper resistance through hash encryption, data traceability through multiple trade links and data sharing through nodes within the network capable of receiving and sending messages to maintain the ledger. They found that further theoretical, practical, and quantitative indicators are required to study the benefits of Blockchain technology based on consumer demand for complete and correct information about the goods they buy.
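The tamper resistance described above can be illustrated with a minimal hash-chained ledger. This is a sketch only and is not the system described by Xu et al. (2022): the record fields, the function names (`block_hash`, `build_chain`, `verify_chain`), the all-zero starting hash and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder "previous hash" for the first block


def block_hash(record, prev_hash):
    """Return the SHA-256 hash of a record combined with the previous block's hash."""
    payload = json.dumps({"record": record, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link records so that each block stores the hash of the block before it."""
    chain = []
    prev = GENESIS_HASH
    for record in records:
        block = {"record": record, "prev_hash": prev, "hash": block_hash(record, prev)}
        chain.append(block)
        prev = block["hash"]
    return chain


def verify_chain(chain):
    """Recompute every hash; any edit to an earlier record breaks the chain."""
    prev = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        if block["hash"] != block_hash(block["record"], block["prev_hash"]):
            return False
        prev = block["hash"]
    return True


# Hypothetical supply chain records (invented for illustration).
records = [
    {"stage": "harvest", "commodity": "rice", "origin": "declared region A"},
    {"stage": "storage", "commodity": "rice", "site": "warehouse 1"},
    {"stage": "processing", "commodity": "rice", "site": "mill 1"},
]
chain = build_chain(records)
```

Calling `verify_chain(chain)` on the untouched chain returns `True`; changing any stored field afterwards makes it return `False`. A check of this kind only detects alteration after entry. It cannot show that the information was correct when it was first entered.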
Lavazza, an Italian roast coffee company, describe their case study (Gazzola et al., 2023) introducing a Blockchain-tracked product to the market. Sensors were set up at the farm to monitor air and soil conditions. However, this was problematic due to reliance on continuous cellular connectivity. A user interface was created to detail the growing, harvesting, drying, quality, transportation, and roasting stages. They found that the strengths of the blockchain included data integration, tangible traceability and transparency, whereas the weaknesses were manual data input, a limit of one batch ID at a time, high costs, difficulty in scaling up and possible falsification. The paper offers an example of how blockchain technology can be applied to increase traceability for stakeholders while maintaining transparency for consumers. However, this analysis aimed to be illustrative rather than definitive and only considered a single case study.
In a magazine article summarising the advantages and disadvantages of blockchain software in food supply (McKenzie, 2018), the author cited examples of software development by Walmart and IBM, aimed at addressing food safety and food traceability issues. Blockchain has the potential to support food safety. Almost 28 million people fall ill due to foodborne illness each year (Centers for Disease Control and Prevention, USA). Should it become easily accessible to trace the supply chain for individual products linked to a food safety issue, contamination could be quickly identified and contained and products could be removed from the supply chain at a faster rate, potentially protecting consumer health, reducing the costs linked to food recalls and increasing consumer trust in certain products. A second area in which blockchain could provide protection relates to food fraud. Blockchain has the potential to protect food supply chains by increasing transparency and accessibility to companies. In addition to providing a manner to quickly trace products and their ingredients, blockchain may provide customers with more confidence in the authenticity of premium products and increase trust and thus increase spending on these products.
As a proof-of-principle example, Walmart asked their team to trace a package of sliced mangoes from their US stores back to their source, a 30-day chain involving 16 farms, two packing houses, three brokers, two import warehouses and one processing facility. This tracking process took the team six days, 18 hours and 26 minutes, a task completed by tracing food safety audits and certificates. Walmart then used the software developed by IBM and tracked the mangoes in only 2.2 seconds.
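To put the two tracing times on the same scale, the arithmetic can be sketched as follows. The durations are those quoted above; the variable names are illustrative only.

```python
from datetime import timedelta

# Durations as quoted in the text.
manual_trace = timedelta(days=6, hours=18, minutes=26)
blockchain_trace = timedelta(seconds=2.2)

manual_seconds = manual_trace.total_seconds()
speed_up = manual_seconds / blockchain_trace.total_seconds()

print(f"Manual trace: {manual_seconds:,.0f} s")
print(f"Speed-up: roughly {speed_up:,.0f}-fold")
```

The manual trace corresponds to 584,760 seconds, so the reported blockchain query was roughly 265,800 times faster for this single example.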
While this scenario shows impressive time savings due to blockchain software, Mitchell Weinberg, founder of food fraud protection and prevention firm Inscatech, argues that it is unrealistic to expect that all food can be tracked by blockchain, citing the example that, for a pack of spices, blockchain cannot determine whether the correct herb or spice was harvested in the first instance. Walmart acknowledge that tracking the entire list of ingredients contained within a single product is complex, due to the fragmentation of the food supply system, with a widespread lack of digitisation and reliance on paperwork. While blockchain can be accurately implemented to track money by creating an accounting trail, Weinberg cites two major impediments to the success of blockchain as a food traceability solution. The first is that it requires every step of the supply chain to be included in the data. The second is that it requires honest participation, in terms of, for example, correctly declaring the commodity, the authenticity of the handling party and where they are located. At present, there is no means of validating that information logged is genuine.
IBM acknowledge that data will not necessarily be guaranteed to be more accurate due to concerns over dishonesty and incorrect information declaration. However, the company hopes that the implementation of blockchain will improve data quality over time by removing the anonymity of those entering the data, which may deter unscrupulous behaviour. Making data more accessible should also mean that the discovery of both accidental and deliberate errors is facilitated. However, the notion that removal of anonymity will result in less unscrupulous behaviour is disputed by Weinberg, who argues that those adulterating food are sophisticated criminals who make much effort to cover their crime. Also, if what is being harvested is not the declared commodity, there is no way of identifying this once incorrect source data are entered into blockchain. Therefore, while blockchain technology may facilitate efficiency in supply, such as quickly identifying delays and where efficiencies in supply can be made, the human element required to generate the data may mean that blockchain is unsuitable for identifying unscrupulous activity in food supply.
In conclusion, although there is the possibility to address the demand for tangible and transparent traceability systems throughout food production systems, there are flaws to Blockchain systems. While blockchain could aid paperchain traceability in a weight-of-evidence approach, a common opinion is that implementation costs of these systems are high. This is mainly due to required investment in the manual setup of the technology at a variety of different and complex stages within the food production process. Furthermore, a single product can involve several supply chains, from raw ingredient suppliers to final product vendors. Another common concern is that much of the data is entered manually and is therefore subject to possible falsification and vulnerable to security attacks.
3.6.9. CEN activities, AOAC/OIV methods and other collaborative opportunities relevant to country of origin determination
Activities for food and liquid commodities by the European Committee for Standardisation (CEN), and for wine by the Association of Analytical Chemists (AOAC) and the International Organization of Vine and Wine (OIV), have been underway to facilitate the standardisation of methods to determine geographical origin. Unfortunately, standardisation activities can take much time.
The European Committee for Standardisation (Food Authenticity Working Group, WG6) has focused, since 2020, on the standardisation of methods for stable isotope analysis (reference CEN/TC460) in the following areas:
- Determination of C and/or N isotope ratios in food by Elemental Analyser-Isotope Ratio Mass Spectrometry (EA-IRMS)
- Determination of 18O/16O isotope ratios in liquid aqueous food matrices by Equilibration-Isotope Ratio Mass Spectrometry (Eq-IRMS)
- Sample preparation for isotope ratio analysis of the different fractions of fruit and vegetable juices and related products
These methods include sample preparation, usage of standards, normalisation and processing of raw data, which are vital aspects for comparability of data sets from different laboratories and consequently country of origin determination.
The finalisation of the standards will be accompanied by inter-laboratory validations which are planned by the Food Authenticity Working Group in the near future.
AOAC INTERNATIONAL’s Official Methods of Analysis℠ program is the organisation’s premier program for consensus method development. Methods approved in this program have undergone rigorous scientific and systematic scrutiny and are deemed to be highly credible and defensible. Concerning wine, Table 5, produced by the Association of Analytical Chemists (AOAC) and the International Organization of Vine and Wine (OIV), summarises the validated and officially approved analytical methods for wine using SIRA and 2H-NMR for the determination of stable isotope ratios in wine constituents (Christoph et al., 2015). The methods are also published in the “Compendium of International Methods of Wine and Must Analysis” of the International Organization of Vine and Wine (OIV). These are used to prove adulteration in cases of chaptalisation, water and sugar addition and incorrect vintage, and also mislabelling of geographical origin.
Finally, other collaborative opportunities could be capitalised on to progress origin verification methods. Twenty-three countries are currently involved in setting up and implementing databases, alongside initiatives from NASIR (National Association of Security and Investigative Regulators, Germany) and NIST (USA). Should the UK become involved, there would be an opportunity to share these data. There is also an opportunity to link up with Commonwealth countries to share data, building on previous UK geographical origin projects to re-visit and expand the data the UK holds on beef, salmon and wine.
In order to form and strengthen collaborations and data sharing, with an aim to provide origin databases spanning large areas, the IAEA is supporting countries (from low to high premium member states) in geographical origin database initiatives, providing training to standardise sampling strategies, sample analyses and data evaluation, leading to routine testing.
Outside of food, the IAEA functions as a third-party auditor, testing 5% of the total samples collected by World Forest ID activities to cross-check data to support timber origin. World Forest ID initially focussed on timber, holding centralised data which are updated continuously by accredited laboratories, and is also expanding to other commodities, for example to verify that soya and palm oil have not been grown in areas cleared for this purpose by unpermitted deforestation, and to map shrimp origin. More information relating to timber origin methodologies, showing an example of a successful working system to trace geographical origin, is included in the stakeholder interview.
Finally, there is a future requirement for the standardisation of statistical analysis to prepare data for geographical origin elucidation.
3.6.10. Outlook for emerging technologies
It is envisaged that further development will be made by manufacturers of handheld and portable devices to offer increased sensitivity, selectivity, and availability. These devices have been shown to be a proof of concept for the rapid determination of geographic origin on local levels. As the technology develops the expertise in operation and interpretation will require development to facilitate the building and understanding of larger more robust datasets on a global scale. A final consideration for these technologies as they develop will be the accreditation and standardisation of the methodologies.
Digital ledger blockchain technology could be a useful tool for geographic origin traceability of farm produce and processed foods. Large amounts of information can be stored in one place. However, the success of these systems is reliant on the co-operation of every member of the supply chain to invest in and apply the technology, which could be a challenge. For this technology to be applied in the field, issues around security and integrity will need to be addressed.
3.7. Summary of most promising techniques per commodity
The progress of various technologies in attributing geographical origin has been discussed in detail above. While emerging technologies may come to the fore in the future, from this information, the list below summarises the methods shown to be most developed and showing the most demonstrated capability and potential.
- Cereals: Relatively little research has been completed for this commodity type and it may be worth investigating with the combination of SIRA and trace element analysis.
- Cocoa: A common theme for cocoa origin determination is a need to transition to more portable, rapid, affordable and non-destructive analytical approaches to support testing in remote areas. Spectroscopy techniques show great promise, combining NIR and sensory techniques with AI.
- Coffee: SIRA analysis, particularly for oxygen isotope ratios, in combination with trace element analysis.
- Fish and Seafood: Gaining analytical data relating to the geographical origin of commodities cultivated in the marine environment is particularly challenging. Trace element profiles vary greatly due to a combination of natural and anthropogenic activities and depending on harvest time. It is therefore likely that the combination of several technologies will provide the most informative models overall, including trace element analysis, NIR and REIMS study of lipid markers. Due to the wide range of varying factors, it will be imperative that databases are constantly updated to account for these variations.
- Fruit juice: Trace element analysis is the most mature analysis which has shown promise.
- Garlic: The research for this commodity is immature, comprising a small dataset over a restricted geographical range. Methods to consider include combining trace element, volatile compounds and metabolite data.
- Honey: Pollen analysis using light microscopy can be used to determine the relative amounts of pollen contributed to honey by different plant species. However, this requires high levels of skill and expertise from a limited number of analysts who are currently able to determine geographical origin based on this method. The study of volatile compounds, sugars, organic acids and amino acids has been used to differentiate between floral types of honey. SIRA, particularly of H, O and C isotopes, shows promise, and trace element analysis can add value in terms of determining environmental factors, such as trace elements in water sources, which relate to where the pollen originated. Metabolomic and genomic approaches in combination with blockchain provide current state-of-the-art technologies. These could be coupled with analysis of volatile compounds. Limitations relating to the low copy numbers which in the past have challenged the success of digital PCR are starting to be overcome and this technology may also be of value in the future.
- Meat: Livestock can be moved between geographical regions during their lifetime, and their dietary background can vary with geography and with feed type and origin. SIRA (H, C, N, S and O) is a well-established technique to infer geographical origin, with carbon isotopes beneficial to discriminate diet and deuterium and oxygen to support the discrimination of geographical origin. Combining SIRA with trace element analysis, and also considering fatty acid profiling, can improve the confidence of the data, along with RFID to monitor livestock movement.
- Olive oil and other edible oils: Since geographical origin is a major source of the variation in oil composition, a multivariate approach seems to offer the most potential. SIRA technology (particularly for C, H and O) has shown great promise for this commodity group, and it is suggested that this could be combined with NMR and profiling of phenolic compounds, fatty acid profile (including FAMEs), sterols, triacylglycerols (TAGs), volatile compounds and colour. The additional inclusion of trace element data could be considered but the impact of fertilisers and fungicide on trace element levels must be understood. The potential of FTIR ATR should also be considered.
- Rice: A combination of SIRA (particularly C, H and O) with trace element analysis seems the most promising application.
- Saffron: There has been insufficient work on this commodity to conclude on the most promising method, but SNP arrays and GBS appear to be worth investigating further.
- Tea: Trace element profiling is the most mature method that has offered the most promise.
- Tomato: Studies are limited but trace element profiling appears to show promise, for example Li/Cu, Co/Rb and Sr/Cd ratios.
- Vanilla: Much more research is required here across various technologies. It appears that genomics technologies are not applicable due to the absence of genetic variation.
- Whisky: There has been much focus on GC, LC and spectroscopy, and it appears that trace element analysis could add to a weight-of-evidence approach.
- Wine: SIRA and SNIF-NMR methodologies are the most mature and are applicable, as exemplified by their application in the EU Wine Databank. These could be combined with trace element profiling to compare the trace element profile of the wine with important components of the soil, especially Sr and Pb. However, the impact of processing must be understood, including the contribution of Cu and Al from the fermentation vats. The impact of trace elements from fining agents, agricultural practices and pollution must also be considered.
3.8. Challenges and limitations to geographical origin determination
This literature review clearly highlights many challenges remaining in the determination of geographical origin across all technologies. Firstly, there is a lack of available databases, with databases not existing, not available for public access or not incorporating a sufficient scope of samples. Keeping databases up-to-date with regular inclusion of contemporary samples will allow for natural and seasonal variation so that the methods are ready for use to respond to issues as soon as they arise. Without regular maintenance, delays in testing will be incurred, the length of which will depend on the level of variation which has occurred in the interim period.
In order for methods to be considered as ready for use, methods must be standardised and validated and must have undergone an inter-laboratory validation. There has been a lack of reference materials and calibration standards (Camin et al., 2017) used during the determination of geographical origin. The AOAC/OIV have published methods for the determination of the geographical origin of wine which have undergone rigorous scientific and systematic scrutiny and are deemed to be highly credible and defensible. Work is underway by the European Committee for Standardisation’s Food Authenticity Working Group to standardise testing of the geographical origin of solid food matrices (raw or processed) and also, separately, liquid matrices, with inter-laboratory trials planned for the future. Reference materials for these methods being considered by CEN are available for purchase and then necessitate intra-laboratory validation by each individual user laboratory. Once this work is complete, proficiency testing schemes could be co-ordinated in the future to support laboratories in achieving the quality of analysis demanded of them by their customers, accreditation bodies and managers.
Data in Appendix 3 show the cost to purchase and run a range of the main technologies currently applied to address geographical origin analyses, including time requirements for training. This is to inform on investments required to support official control in the future. As shown in this appendix, most of the technologies involved require significant investment to purchase and run the equipment and require high levels of training. The estimated charge per sample assumes that a sufficient number of samples are analysed by the required method each year to cover the costs of method accreditation including participation in any future proficiency testing rounds, instrument maintenance, annual servicing and also the costs of gases where applicable. It is therefore suggested that it may be most cost effective for Official Control Laboratories to sub-contract analyses requiring instruments which they do not routinely own, rather than investing in new instruments and training. The technology with the lowest cost is portable NIR, which requires much less investment and training and produces data rapidly.
4. Stakeholder engagement activities
In order to capture information and views from stakeholders relating to the geographical origin of food, and to better understand the current outlook regarding geographical origin verification of food and feed, a range of stakeholders were contacted either to be interviewed or to complete a questionnaire. These comprised stakeholders from enforcement, geographical origin testing laboratories, representatives of food traceability networks and trade bodies, independent experts in food authenticity, those working in database and technology development, and stakeholders in the supply chain. They were asked about their involvement in geographical origin verification and about potential future actions to support geographical origin. In addition to these stakeholders, an independent consultant, Dr Simon Kelly, was contacted, alongside an organisation which is running a successful working model of traceability of food and timber commodities (World Forest ID). Dr Kelly is knowledgeable in origin verification, particularly specialising in SIRA. The outputs of this stakeholder engagement are included in Appendix 4.
World Forest ID provided an example of a working model for verification of geographical origin, albeit mainly for timber, with a more minor element for food and feed commodities. This model has a large dataset, dispersed over an almost global area, the methods of collecting samples and analysing data have been standardised and the database is curated centrally. The datasets are constantly expanded, allowing for variation according to commodity type (including tree species), season and year.
4.1. Summary of key aspects highlighted by stakeholders
- It is understood across stakeholders that there is no ‘silver bullet’ technology to determine the accurate origin of food and feed commodities and that testing data need to be checked to determine if they are consistent or not with the declared origin in order to make a judgement regarding the accuracy of food labels or paper trails. While SIRA and trace element testing are the most commonly used technologies across commodity types, stakeholders acknowledged that testing using other technologies can add weight-of-evidence.
- Geographical origin of food is not a priority testing area. Food safety is the main focus, followed by species identification so research and capabilities for origin verification are lagging behind. Origin verification tends to be a question for those higher in the supply chain who tend to be the parties who instigate testing and the point was raised that it is difficult to engage with those involved earlier in the supply chain as origin questions are regarded as a problem rather than as a solution to food authenticity and value.
- Traceability documents are an easy target for counterfeit activities. Large companies tend to have a simpler supply chain than smaller companies, as well as technical staff to support traceability. Traceability is therefore more difficult for smaller companies, which would benefit from more support in this area.
- SIRA and trace element analysis are the main methodologies used to verify origin, often in combination. These technologies provide data which can be used to determine whether this aligns or not with the declared origin of the food, but don’t provide unequivocal origin determination.
- The main analytical challenge highlighted was the need for extensive and robust databases, with large numbers of authentic samples and being updated continuously. Almost all stakeholders highlighted this.
- Issues with databases:
  - Databases need to be representative of seasonal and annual variations, including taking account of climate change impacts.
  - Databases should be built with assured authentic samples, collected and tested against standardised protocols to provide robust databases, with no chance that fraudulent samples could be included.
  - Metadata should be recorded to assist data interpretation, e.g., extreme temperatures in a specific year / season.
  - Databases should be challenged regularly using authentic and non-authentic samples.
  - Datasets are usually owned by private companies and not publicly available. This is due to the large investments required to build databases and the commercial value that they have.
  - Sharing of data and transparency about numbers / types of samples and uncertainty associated with specific testing methods / databases / data analysis interpretation were flagged as areas for improvement.
  - Currently, many UK testing customers access databases which hold only UK data and therefore verification of origin of non-UK produce is a challenge. The point was raised that it would be ideal for the databases to be curated centrally with public access and that the lack of transparency from database holders relating to database design and data interpretation was a challenge to food origin verification.
  - Due to the lack of transparency regarding database design and composition, points were raised that, when a customer receives an unexpected test result relating to origin declarations, they can approach an alternative provider and receive an alternative outcome without understanding which test may provide the more representative result for their query.
- Testing laboratories tend to produce their own reference materials, which usually have been subjected to inter-laboratory validation and have been produced from authentic samples.
- There is agreement regarding the preference for accredited laboratories and methods, where available. Also, it would be beneficial if laboratories supported the initiation of proficiency testing or similar, in which they could participate to demonstrate their competency in methods.
5. HorizonScan™ data relating to geographical origin
Data were sourced to inform on the number of global incidents relating to geographical origin issues in food and feed over the last decade. HorizonScan™ provides food safety professionals with a comprehensive, scientific basis for assessing supply chain risks. The tool is a web-based information service with a suite of proprietary risk assessment and supplier check tools tracking current and historical global food fraud and contamination issues in near real time which are reported via official means. HorizonScan™ collects daily data for 550 commodities, from over 110 food safety agencies and 180 countries. Data were taken from HorizonScan™ to detail incidents recorded relating to geographical origin issues over the last ten years. The data are shown in Figures 2 and 3.
As shown in Figure 2, the commodities with the largest numbers of notified incidents relating to geographical origin over the last decade are honey and wine, followed by fish, meat, olive oil and food/dietary supplements. In Figure 2, each of the honey notifications relates to honey of incorrectly declared country of origin and often also of (related) incorrectly declared botanical origin in the same product notifications. Along with incorrect origin declarations for a given product, some notifications also include adulteration of these honeys with undeclared or unauthorised dyes and sweeteners. The wine notifications in Figure 2 relate to the grapes having been harvested in a different country of origin to that declared and the fish notifications relate to incorrect or insufficient information regarding country of origin.
From the HorizonScan™ data, 68% of the notifications for exports from the Czech Republic (Figure 3) relate to honey being of the incorrect geographical origin (relating to incorrect botanical origin) plus issues in the same honeys relating to addition of colourings and sweeteners. These notifications were all reported in the period January 2015 to December 2023. Relating to wine exported from the Czech Republic, 27% of the notifications concern the grapes having been harvested in a different country of origin to that declared.
Of the issues shown in Figure 3 for exports from the Slovak Republic, 85% relate to wines containing grapes harvested from an alternative country of origin to that declared. Concerning the high number of issues shown in Figure 3 relating to unknown exporting nations (exporting nation not declared in the Rapid Alert System for Food and Feed (RASFF) notification), these data relate to a wide range of commodities of unknown origin. The issues relating to Spain are also for a range of different commodities.
In addition to these officially recorded data, issues were captured in the grey literature (trade journals) relating to the faking of PGI balsamic vinegar of Modena (prepared with suspected lower-grade grapes)[25], falsification relating to the vintage and origin of wine in New Zealand[26] and concerns for the geographical origin of meat labelled as British[27].
6. Conclusions and Challenges: Review of methods for the verification of geographical origin of food and feed
There is no unequivocal single technique with which to verify the country of origin of food and feed. Testing methods do not recognise country borders and there can be wide variations in data profiles across countries. Technologies can be applied but, rather than the data providing an absolute geographical origin, the data must be evaluated to determine whether they are consistent or not with the declared origin of a product.
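The idea of checking whether data are consistent with a declared origin, rather than assigning an origin outright, can be sketched as follows. This is an illustrative simplification and not a method taken from the studies reviewed: the reference values are invented, the country names are placeholders, and the univariate rule (each measurement within k standard deviations of the reference mean, with k = 2 assumed) stands in for the multivariate models used in practice.

```python
from statistics import mean, stdev

# Hypothetical reference database of authentic samples (invented values).
# "d18O" is an oxygen isotope ratio and "Sr" a strontium concentration.
reference_db = {
    "Country X": {"d18O": [1.0, 1.4, 1.2, 0.9, 1.3],
                  "Sr": [0.80, 0.85, 0.78, 0.83, 0.81]},
    "Country Y": {"d18O": [3.1, 3.4, 2.9, 3.3, 3.0],
                  "Sr": [1.40, 1.35, 1.45, 1.38, 1.42]},
}


def is_consistent(sample, declared_origin, k=2.0):
    """Return True if every measurement lies within k standard deviations
    of the reference mean for the declared origin (assumed rule)."""
    reference = reference_db[declared_origin]
    for variable, value in sample.items():
        ref_values = reference[variable]
        if abs(value - mean(ref_values)) > k * stdev(ref_values):
            return False
    return True


# Hypothetical test samples, both declared as "Country X".
plausible_sample = {"d18O": 1.1, "Sr": 0.82}
suspect_sample = {"d18O": 3.2, "Sr": 1.40}
```

A result of `True` means only that the data do not contradict the declared origin; it does not prove that origin. The outcome also depends entirely on how representative the reference samples are.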
The maturity of the testing for origin varies greatly between commodities and geographical locations. The validity of studies undertaken to date is highly dependent on the number and nature of the samples used as a reference collection. In many studies this is inadequate to reach firm conclusions and there is also doubt over the robustness of sampling plans in published studies. Obtaining authentic reference materials is a critical problem. While combined SIRA and trace element analyses are the most widely used technologies and often provide the most pertinent data for provenance, especially in combination with strontium analysis, other technologies such as genomics and spectroscopy are applicable or can add value to SIRA and TE data, depending on the commodity and scenario.
Using multiple data sets from different analytical sources generally improves classification rates and therefore the robustness of origin determination. Data fusion is an effective strategy for improving the classification performance. There is scope to investigate the inclusion of other data such as that from contaminants such as pesticides, dioxins etc to improve the outcomes of testing methods. If successful, applying multivariate data in this way provides a large suite of evidence to substantiate testing results. A point to note is that using multivariate methods may require more frequent monitoring compared to using, for example, SIRA and TE data only.
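As an illustration of data fusion, the sketch below uses one simple approach, often called low-level fusion, in which variables from each analytical block are autoscaled and concatenated before classification. The report does not prescribe a fusion method; the numbers are invented, the origins "A" and "B" are placeholders, and the nearest-centroid classifier is a stand-in for the chemometric models used in the literature.

```python
from math import dist
from statistics import mean, stdev

# Hypothetical training data (invented numbers, for illustration only).
# Block 1: stable isotope ratios (d13C, d18O); block 2: trace elements (Sr, Rb).
labels = ["A", "A", "A", "B", "B", "B"]
sira = [[-26.1, 1.2], [-25.8, 1.5], [-26.4, 1.0],
        [-24.0, 3.1], [-23.7, 3.4], [-24.3, 2.9]]
trace = [[0.80, 2.1], [0.85, 2.3], [0.78, 2.0],
         [1.40, 1.1], [1.35, 1.0], [1.45, 1.2]]


def column_stats(block):
    """Return per-variable means and standard deviations for one block."""
    columns = list(zip(*block))
    return [mean(c) for c in columns], [stdev(c) for c in columns]


def fuse(rows_per_block, stats_per_block):
    """Autoscale one sample's row from each block and concatenate the results."""
    fused = []
    for row, (means, sds) in zip(rows_per_block, stats_per_block):
        fused.extend((v - m) / s for v, m, s in zip(row, means, sds))
    return fused


# Scaling parameters come from the training data only.
stats = [column_stats(sira), column_stats(trace)]
fused_train = [fuse([s, t], stats) for s, t in zip(sira, trace)]

# One centroid per origin in the fused variable space.
centroids = {}
for label in sorted(set(labels)):
    members = [row for row, lab in zip(fused_train, labels) if lab == label]
    centroids[label] = [mean(col) for col in zip(*members)]


def classify(sira_row, trace_row):
    """Assign a new sample to the origin with the nearest fused centroid."""
    fused = fuse([sira_row, trace_row], stats)
    return min(centroids, key=lambda lab: dist(fused, centroids[lab]))
```

For example, `classify([-26.0, 1.3], [0.82, 2.2])` assigns the sample to origin "A". Autoscaling each block before concatenation prevents variables with large numerical ranges from dominating the distance calculation.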
There are barriers to the implementation of methods for geographical origin determination. The main challenges relate to the quality, size, geographical range and curation of the databases. The lack of continuous expansion of the databases used for verification of origin, and the resulting loss of current relevance with respect to seasonal and annual natural variation, is discussed throughout this report, as are the size and robustness of those databases.
Datasets are fragmented, with small datasets available for certain constrained locations, for example neighbouring regions of one country with data generated for a single food commodity. Data sharing is poor. Due to investment and IP concerns, data tend to be held in private databases and the UK can mostly access only UK data. This makes the origin verification of imports challenging. Before robust geographical origin testing methods can be achieved, large representative databases containing authentic samples must be prepared, incorporating high numbers of samples for which datasets span large areas of the globe. The number of samples and extent of geographical range should be stated to facilitate accurate interpretation and uncertainty judgements of results. Investment is required since databases must be continuously expanded to incorporate natural season-to-season and annual variability. Increasing the number of samples and the geographical area within a dataset would add confidence to the outputs and would improve the quality of the prediction models, to the benefit of all users. The involvement of the food industry in providing samples would increase the representation of natural variability within datasets.
Databases must be continuously expanded to account for season-to-season and annual variation. They should be challenged with new/additional samples to ensure continued relevance.
Ideally, in the future, databases would be curated centrally, an approach successfully adopted for the provenance of timber. Data should only be accepted into the database once validity has been demonstrated by use of accepted/standardised methods for collection and analysis of authentic samples. Standardisation activities are under development by AOAC, CEN and CODEX; however, this takes time. Transferability of data from one lab to another must be demonstrated by proficiency trials.
Other barriers to geographical origin verification relate to:
- Costs: the main methods used require mass spectrometric analysis, which needs high-cost instruments and skilled analysts.
- Contacts are required to source authentic samples in foreign countries in a standardised manner, to build and expand databanks of analytical data.
- Lack of food-specific matrix-matched reference materials for SIRA.
- Demand for proficiency testing is lacking for geographical origin determination and this is impeding the trustworthiness of the data.
- Requirement for future standardisation of statistics applied to geographical origin data.
- Presenting multivariate data in court is a challenge. Only a limited number of methods have been considered ready to be used in court, for example single isotope analysis data for wine and for the German butter trade.
- IP considerations of databases and related accessibility.
- Due to the nature of the methods, origin verification is problematic when a product is adulterated and contains a commodity from a mixture of origins; the achievable limits of testing for adulterated samples must be understood.
- Few methods to verify origin are accredited and the accreditation process can take twelve months. Means to fast-track methods for accreditation are needed when a method is specifically developed to address and investigate a known issue in the supply chain.
The confounding issues relating to meat origin mainly concern the movement of animals and meat products during rearing and supply, which, from an analytical perspective, makes it very difficult to identify the source of meat products. Analytical methods for the detection of meat origin also need to consider different types and geographical sources of animal feed which can impact on the chemical profile. Routine testing for origin and the establishment of databases for meat authenticity will fail if they do not account for feeding regimes and the movement of animals.
Where many variables contribute to the authenticity of a product, for example wine, a critical first step is to secure good information on geographical origin and on any confounding factors such as vintage, so that robust metadata are collected.
The most substantial approach to country of origin determination to date has combined SIRA, trace element analysis and chemometrics, which have long been applied to PDO/PGI foods. There is nonetheless scope for other methods to emerge, either for initial, fast screening or for more detailed analysis. While hand-held spectroscopy technologies have been used in small proof-of-principle studies, multivariate analysis may be considered as a natural next step in the quest to verify origin, fusing data generated by genomics (including environmental genomics), isoscapes (for example hydrology) and metabolomics. In the future, it would also be interesting to combine other existing datasets to determine their value in verifying geographical origin. Datasets which could provide valuable information include environmental data, pesticide residue data, dioxin and polychlorinated biphenyl levels (particularly relevant to foods including salmon), per- and polyfluoroalkyl substances (PFAS), viral screen data and data on the presence of fungal species. Other existing data which have not been applied to origin determination, such as various data held by the British Geological Survey, could be incorporated into datasets.
While many of the larger members of the food supply chain are aware of the importance of accurate origin labelling and perform testing to challenge label information, there is opportunity to engage and educate parties at the lower end of the supply chain in order to mitigate subsequent issues. Projects at IAEA such as the Implementation of Nuclear Techniques for Authentication of Foods with High-Value Labelling Claims (INTACT Food) project are going some way to filling this knowledge gap.
Finally, there are opportunities for public, private and inter-nation funding to improve methods in geographical provenance which could be capitalised upon to protect our supply chain.
6.1. Future Direction
Based on the information gathered in this project, key areas of future work which we recommend are detailed below:
- All stakeholders highlighted challenges with a lack of quality data or up-to-date data in the various databases which have been prepared over the years in support of food provenance testing. In the future, investment could be made to support the building of robust datasets, necessitating the harmonised collection and analysis of a high number of authentic samples from a large (e.g. global) geographical area. Efforts should focus on commodities which are vulnerable to origin fraud.
- These datasets should be curated in a centralised hub to facilitate harmonisation, and they should be regularly expanded with samples to account for seasonal and annual variation so that the models are ready for immediate use to respond to issues in the supply chain. The hub should be funded so as to allow continued expansion and curation of the datasets over time. The suitability of the datasets should be challenged each year by the testing of additional samples (i.e. not the samples included in the prediction models) and all testing facilities involved should take part in proficiency trials and achieve acceptable data, demonstrated through statistical measures, for example z-scores.
- Few methods to verify origin are accredited and the accreditation process can take twelve months. Means to fast-track methods for accreditation are needed when a method is specifically developed to address and investigate a known issue in the supply chain.
- Relevant matrix-matched certified reference materials should be made available.
- Demand for proficiency testing in origin analyses should be encouraged and certified testing schemes should be initiated. Currently no proficiency testing schemes are detailed for geographical origin on the global EPTIS database.
- There are many existing data from other testing exercises which are not yet, or only rarely, used in origin verification and which could have value for origin determination. This could be explored. Investment could be made to initiate the collation of these existing data and metadata and to investigate their utility for origin verification purposes. Prediction models could be built to determine the relevance of these data for addressing origin issues. Such existing data could include environmental and geological data, pesticide residue data, dioxin and polychlorinated biphenyl levels (particularly relevant to foods including salmon), viral screen data and fungi species presence information.
- Government, private and EU funding opportunities in origin determination could be considered to expand collaborations between those working on building datasets for a commodity (or range of commodities) across different geographical locations and, in turn, to expand the geographical range of datasets.
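The z-scores mentioned in the recommendation on proficiency trials can be illustrated with a short sketch. It assumes the conventional proficiency-testing definition, z = (x − X) / σ, where x is a laboratory's result, X the assigned value and σ the standard deviation for proficiency assessment, together with the commonly used interpretation bands (|z| ≤ 2 satisfactory, 2 < |z| < 3 questionable, |z| ≥ 3 unsatisfactory). The laboratory names and all numerical values are invented for illustration and do not come from any scheme described in this report.

```python
def z_score(result, assigned_value, sigma_pt):
    """Proficiency-test z-score: deviation from the assigned value in units
    of the standard deviation for proficiency assessment."""
    return (result - assigned_value) / sigma_pt


def interpret(z):
    """Conventional interpretation bands for a z-score."""
    magnitude = abs(z)
    if magnitude <= 2.0:
        return "satisfactory"
    if magnitude < 3.0:
        return "questionable"
    return "unsatisfactory"


# Invented values for a hypothetical round on a single measurand.
assigned_value = -26.00
sigma_pt = 0.20
lab_results = {"Lab 1": -26.10, "Lab 2": -25.50, "Lab 3": -26.70}

outcomes = {
    lab: interpret(z_score(x, assigned_value, sigma_pt))
    for lab, x in lab_results.items()
}
for lab, outcome in outcomes.items():
    print(lab, outcome)
```

A scheme for origin analyses would also need to decide how the assigned value and σ are set for each measurand, which this sketch leaves open.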
7. Acknowledgements
Fera Science gratefully acknowledges the joint funding from Food Standards Agency and Defra to conduct this work. We are indebted to all stakeholders who engaged with this project via interviews and completion of questionnaires to inform on issues relating to the current status of geographical origin of food. We also gratefully acknowledge Dr Simon Kelly who was a specialist consultant for the project, particularly specialising in SIRA.
This report has been prepared by Fera Science Limited (“Fera”) for the sole benefit of Food Standards Agency and Defra. This document, and all the information, images and intellectual property rights in it belong to Fera (or its licensees). No part of the text or graphics may be reproduced without the prior written permission of Fera. Except as otherwise advised in writing by Fera, this information is confidential in nature and must be treated by the receiver with at least the degree of care that it applies to its own confidential information (and always with at least a reasonable standard of care).
Fera shall not be liable for any claims, losses, demands or damages of any kind whatsoever (whether such claims, losses, demands or damages were foreseeable, known or otherwise and whether direct, indirect or consequential) arising out of or in connection with: (i) any advice given by Fera or its representatives; and/or (ii) the preparation of any technical or scientific reports. Fera makes no representation as to the suitability of using any particular goods in any manufacturing processes or scientific research, nor as to their use in conjunction with any other materials. Fera shall not be liable for any reliance placed on, nor for any recommendations, interpretation, analysis, guidance, suggestions, proposals or endorsements made in connection with, the services and/or the commercial or scientific activities carried out by Fera or its representatives.
Commission Delegated Regulation (EU) No 664/2014 (eur-lex.europa.eu)
eAmbrosia - the EU geographical indications register (ec.europa.eu)
Guidance: Protected geographical food and drink names: UK GI schemes (gov.uk)
Country of origin of foods study published | food.gov.uk (nationalarchives.gov.uk)
Study: No Misleading Claims of Food’s Country Origins (foodingredientsfirst.com)
Ensuring the Integrity of the European food chain (cordis.europa.eu)
Metagenomics for determination of origin part 2 (full study) - FA0141 (sciencesearch.defra.gov.uk)
Chinese Meteorological Administration stations (data.cma.cn)
FS515009 (British Beef Origin Project 2) (foodstandards.gov.scot)
FS515009 (British Beef Origin Project 2) (foodstandards.gov.scot)
Geographical Differences of Trace Elements in Wines (resources.perkinelmer.com)
‘Fake balsamic vinegar’ scandal as Italy uncovers major fraud case (thegrocer.co.uk)
Tesco ‘foreign’ pork chop sold as British a one off, industry claims (thegrocer.co.uk)