Automatic Nutritional Information Extraction from Photographic Images of Labels

Lara Rafaela Almeida Marinha

Bibliographic details
Main author:	Lara Rafaela Almeida Marinha
Publication date:	2015
Document type:	Dissertation
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	https://hdl.handle.net/10216/83493
Abstract:	In recent years, people have shown an increasing interest in improving their diet. Many factors contribute to this growth, one of them being the alarming rise of diet-related diseases. This group of diseases, which includes cardiovascular diseases, obesity, diabetes and cancer, is progressively becoming the most common cause of death. Currently, almost all food products on the market carry nutrition labels, that is, any information on the product package referring to the values of the following nutrients: energy, proteins, carbohydrates, fats, dietary fiber, sodium, vitamins and minerals. This information provides great insight into a product's composition and helps consumers make healthier food choices. Because the labels do not have a regulated or standard format, each product often presents the nutrition information differently, leading to a wide variety of nutrition labels on the market. This, combined with the large amount of information displayed and the difficulty of interpreting the data without the necessary knowledge, makes extracting and analysing the relevant data a hard task for consumers. One solution suggested in many studies on this subject is to present a summary of the nutrition information as a complement to the nutrient-specific information. The main goal of this project is to overcome this problem by offering the consumer an Android application that helps with the extraction and interpretation of these values. The application attempts to automatically extract the nutritional information from an image of a nutrition declaration and presents it in a single, consistent format, following the new regulations and with some additional aids, including values relative to the recommended daily doses and simplified schemes. In addition, it is possible to compare two products of the same category.
To achieve these goals, the image must first be converted into digital text for later processing. To perform this conversion, the application uses Tesseract, the OCR engine developed by Google. Many problems arose during the development of this project, such as the low accuracy of the OCR engine and the difficulty of acquiring images with a mobile device. However, after applying pre- and post-processing algorithms, the accuracy increased to 55%, 83% more than without any preprocessing. In addition, the percentage of images that return 0 matches decreased from 30% to 8%.
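The post-processing step described in the abstract (turning OCR text into nutrient values and relating them to recommended daily amounts) could look roughly like the sketch below. It is not the thesis implementation. The function names, the regular expression, and the fuzzy-matching cutoff are hypothetical. The reference intakes are the adult values from EU Regulation 1169/2011, on the assumption that this is what "the new regulations" refers to. The input is plain text as an OCR engine might return it, so no OCR library is called.

```python
import re
from difflib import get_close_matches

# Assumed adult reference intakes in grams per day (EU Regulation 1169/2011).
REFERENCE_INTAKE_G = {
    "fat": 70.0,
    "saturates": 20.0,
    "carbohydrate": 260.0,
    "sugars": 90.0,
    "protein": 50.0,
    "salt": 6.0,
}

# A textual label followed by a number and the unit "g", e.g. "Protein 7,5 g".
LINE_RE = re.compile(
    r"^\s*([A-Za-z][A-Za-z ]*?)\s*[:\-]?\s*(\d+(?:[.,]\d+)?)\s*g\b"
)


def parse_label(ocr_text, cutoff=0.75):
    """Map noisy OCR lines to known nutrient names and gram values."""
    values = {}
    for line in ocr_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # e.g. energy lines in kcal, or unreadable text
        name = re.sub(r"^of which\s+", "", m.group(1).strip().lower())
        # Fuzzy match tolerates single-letter OCR mistakes ("protcin").
        best = get_close_matches(name, list(REFERENCE_INTAKE_G), n=1, cutoff=cutoff)
        if best:
            values[best[0]] = float(m.group(2).replace(",", "."))
    return values


def percent_of_reference(values):
    """Express each gram value as a percentage of its reference intake."""
    return {
        name: round(100.0 * grams / REFERENCE_INTAKE_G[name], 1)
        for name, grams in values.items()
    }


if __name__ == "__main__":
    sample = "\n".join([
        "Energy 250 kcal",
        "Fat 3.5 g",
        "of which saturates 1,0 g",
        "Carbohydrale 13 g",   # OCR misread of "Carbohydrate"
        "Protcin 7,5 g",       # OCR misread of "Protein"
        "Salt 0.6 g",
    ])
    print(percent_of_reference(parse_label(sample)))
```

Matching against a small fixed vocabulary is one simple way to recover from character-level OCR errors. Misreads that replace a letter with a digit are not handled by this pattern and would need an extra normalisation step.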

Item metadata

id	RCAP_776efdc64bfe3ce0b9e453698dc82e7e
oai_identifier_str	oai:repositorio-aberto.up.pt:10216/83493
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
dc.title.none.fl_str_mv	Automatic Nutritional Information Extraction from Photographic Images of Labels
author	Lara Rafaela Almeida Marinha
author_facet	Lara Rafaela Almeida Marinha
author_role	author
dc.contributor.author.fl_str_mv	Lara Rafaela Almeida Marinha
dc.subject.por.fl_str_mv	Engenharia electrotécnica, electrónica e informática Electrical engineering, Electronic engineering, Information engineering
topic	Engenharia electrotécnica, electrónica e informática Electrical engineering, Electronic engineering, Information engineering
publishDate	2015
dc.date.none.fl_str_mv	2015-07-16 2015-07-16T00:00:00Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	https://hdl.handle.net/10216/83493 TID:201310651
url	https://hdl.handle.net/10216/83493
identifier_str_mv	TID:201310651
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799136155947499521
