Auditory image understanding for the visually impaired based on a modular computer vision sonification model

Banf, Michael

Citation link: https://nbn-resolving.org/urn:nbn:de:hbz:467-7716

DC Field	Value	Language
dc.contributor.author	Banf, Michael	-
dc.date.accessioned	2019-09-02T10:00:51Z	-
dc.date.available	2013-11-26T12:12:12Z	-
dc.date.available	2019-09-02T10:00:51Z	-
dc.date.issued	2013	-
dc.description.abstract	Die vorliegende Arbeit beschreibt ein System, das blinden Menschen einen direkt erfahrbaren Zugang zu Bildern mit Hilfe akustischer Signale anbietet. Der Benutzer exploriert ein Bild interaktiv auf einem berührungsempfindlichen Bildschirm und erhält eine akustische Rückmeldung über den Bildinhalt an der jeweiligen Fingerposition. Die Gestaltung eines solchen Systems beinhaltet zwei größere Herausforderungen: Welche ist die relevante Bildinformation, und wie kann möglichst viel Information in einem Audiosignal untergebracht werden. Wir behandeln diese Probleme basierend auf einem modularen Computer Vision Sonifikations Modell, welches wir als grundlegendes Gerüst für die Aufnahme, Exploration und Sonifikation von visueller Information zur Unterstützung blinder Menschen vorstellen. Es werden einige Ansätze vorgestellt, welche hierzu die Information auf verschiedenen Abstraktionsebenen kombinieren. So z.B. sehr grundlegende Information wie Farbe, Kanten und Rauigkeit und komplexere Information, welche durch die Verwendung von Machine Learning Algorithmen gewonnen werden kann. Diese Machine Learning Algorithmen behandeln sowohl das Erkennen von Objekten als auch die Klassifikation von Bildregionen in "künstlich" und "natürlich", basierend auf einem neu entwickelten Typ eines probabilistischen graphischen Modells. Wir zeigen, dass dieser Mehr-Ebenen Ansatz dem Benutzer direkten Zugang zum Wesen und Position von Objekten und Strukturen im Bild ermöglicht und gleichzeitig das Potential neuester Entwicklungen im Bereich Computer Vision und Machine Learning ausnutzt. Während der Exploration kann der Benutzer erkannte "künstliche" Strukturen oder bestimmte natürliche Regionen als Referenzpunkte verwenden, um andere natürliche Regionen mit Hilfe deren individueller Position, Farbe und Texturen zu klassifizieren. Wir werden zeigen, dass geburtsblinde Teilnehmer diese Strategie erfolgreich einsetzen, um ganze Szenen zu interpretieren und zu verstehen.	de
dc.description.abstract	This thesis presents a system that strives to give visually impaired people direct perceptual access to images via an acoustic signal. The user explores the image actively on a touch screen or touch pad and receives auditory feedback about the image content at the current position. The design of such a system involves two major challenges: what is the most useful and relevant image information, and how can as much information as possible be captured in an audio signal? We address these problems with a Modular Computer Vision Sonification Model, which we propose as a general framework for the acquisition, exploration, and sonification of visual information to support visually impaired people. General approaches are presented that combine low-level information, such as color, edges, and roughness, with mid- and high-level information obtained from Machine Learning algorithms. This includes object recognition and the classification of regions into the categories "man-made" versus "natural" based on a novel type of discriminative graphical model. We argue that this multi-level approach gives users direct access to the identity and location of objects and structures in the image, while still exploiting the potential of recent developments in Computer Vision and Machine Learning. During exploration, the user can use detected man-made structures or specific natural regions as reference points to classify other natural regions by their individual location, color, and texture. We show that congenitally blind participants employ this strategy successfully to interpret and understand whole scenes.	en
dc.identifier.uri	https://dspace.ub.uni-siegen.de/handle/ubsi/771	-
dc.identifier.urn	urn:nbn:de:hbz:467-7716	-
dc.language.iso	en	en
dc.rights.uri	https://dspace.ub.uni-siegen.de/static/license.txt	de
dc.subject.ddc	004 Informatik	de
dc.subject.other	Computer Vision	de
dc.subject.other	Sonifikation	de
dc.subject.other	Computer Mensch Interaktion	de
dc.subject.other	Assistive Systeme	de
dc.title	Auditory image understanding for the visually impaired based on a modular computer vision sonification model	en
dc.title	Akustisches Bildverständnis für Sehbehinderte basierend auf einem modularen Computer Visions Sonifikations Modell	de
dc.type	Doctoral Thesis	de
item.fulltext	With Fulltext	-
ubsi.date.accepted	2013-10-15	-
ubsi.publication.affiliation	Institut für Bildinformatik	de
ubsi.subject.ghbs	TVVC	-
ubsi.subject.ghbs	TWK	-
ubsi.type.version	publishedVersion	de
Appears in Collections:	Hochschulschriften

Files in This Item:

File	Description	Size	Format
banf.pdf		44.57 MB	Adobe PDF	View/Open

This item is protected by original copyright

Page view(s): 827 (checked on Nov 29, 2024)

Download(s): 511 (checked on Nov 29, 2024)
