This paper proposes a cross-cultural computing system for multilingual analysis. The system compares cultural aspects on the basis of basic linguistic elements. Its central task is to realize cross-cultural computation within a correlation-computation framework, using vectorized numeric data that express, in terms of speech sounds, the cultural aspects of concepts and objects.
The key technology of the system is cross-cultural semantic distance computation in phonological-semantic metadata spaces that capture the phonological aspects of sound, syllabic, and lexical composition features. The phonological-semantic metadata of multiple languages are extracted according to two main aspects of language: form and meaning. Form refers to speech sound, and meaning refers to the semantics of the language.
We compare language units (terms) that have the same meaning in different cultures, focusing on the speech sound characteristics of each term. The speech sound metadata are extracted from a term and separated according to the phonological aspects of sound, syllabic, and lexical composition features. These metadata are converted into vectorized numeric data to create phonological-semantic vector spaces. Using these spaces, we perform similarity and weighting computations for a comparative analysis of language-related metadata.
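The vectorization and similarity steps can be illustrated with a minimal sketch. It is not the feature set used by the system: the function `phonological_vector`, its four features (vowel count, consonant count, approximate syllable count, term length), and the optional `weights` argument are all hypothetical stand-ins for the sound, syllabic, and lexical composition features, and the sample strings are illustrative rather than data from the experiment.

```python
import math

VOWELS = set("aeiou")  # assumption: terms are given in a simple romanization


def phonological_vector(term):
    """Map a romanized term to a toy feature vector (hypothetical features)."""
    letters = [ch for ch in term.lower() if ch.isalpha()]
    vowels = sum(ch in VOWELS for ch in letters)
    consonants = len(letters) - vowels
    # Approximate the syllable count as the number of maximal vowel runs.
    syllables = 0
    prev_vowel = False
    for ch in letters:
        is_vowel = ch in VOWELS
        if is_vowel and not prev_vowel:
            syllables += 1
        prev_vowel = is_vowel
    return [float(vowels), float(consonants), float(syllables), float(len(letters))]


def cosine_similarity(u, v, weights=None):
    """Cosine similarity of two vectors, with optional per-feature weights."""
    if weights is None:
        weights = [1.0] * len(u)
    dot = sum(w * a * b for w, a, b in zip(weights, u, v))
    norm_u = math.sqrt(sum(w * a * a for w, a in zip(weights, u)))
    norm_v = math.sqrt(sum(w * b * b for w, b in zip(weights, v)))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Two illustrative same-meaning terms from two hypothetical languages.
sim = cosine_similarity(phonological_vector("mata"), phonological_vector("maka"))
distance = 1.0 - sim  # one possible inter-term distance
```

A distance of `1 - similarity` is only one possible choice; any metric defined on the vector space could take its place.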
Our research goal is to perform a language similarity analysis through term-based distance calculation in phone (sound) and meaning spaces, and to reconstruct the inheritance relationships among languages via agglomerative hierarchical clustering based on inter-term distances.
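As a rough illustration of the clustering step, the sketch below runs agglomerative hierarchical clustering over a precomputed inter-language distance matrix. Average linkage is an assumption made for this example, since the linkage criterion is not stated here, and the function name `agglomerative`, the labels, and the toy distances are hypothetical. In practice each matrix entry might be an aggregate (for example, the mean) of the inter-term distances between two languages.

```python
def agglomerative(dist, labels):
    """Average-linkage agglomerative clustering.

    dist   -- symmetric matrix of pairwise distances (list of lists)
    labels -- one name per row of dist
    Returns the merge history as (sorted member labels, merge distance) pairs.
    """
    clusters = {i: [i] for i in range(len(labels))}
    merges = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        for pos, a in enumerate(keys):
            for b in keys[pos + 1:]:
                # Average distance over all cross-cluster member pairs.
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        members = clusters[a] + clusters[b]
        merges.append((sorted(labels[i] for i in members), d))
        clusters[a] = members
        del clusters[b]
    return merges


# Toy distances between three hypothetical languages (not experimental data).
labels = ["A", "B", "C"]
dist = [
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 5.0],
    [4.0, 5.0, 0.0],
]
history = agglomerative(dist, labels)
# history == [(["A", "B"], 1.0), (["A", "B", "C"], 4.5)]
```

The merge history can be read as a dendrogram: languages that merge early, at small distances, are the candidates for a close inheritance relationship.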
Our system clusters the phonological-semantic vector space and presents a 2D visualization of cultural differentiation to further analyze the interconnectedness across languages. In this paper, we apply the proposed cross-cultural computing system experimentally to linguistic data from 32 Asian-Oceanic languages.