Hydria: An Online Data Lake for Multi-Faceted Analytics in the Cultural Heritage Domain
Abstract
1. Introduction
- (i) We put forward Hydria, an online, free, zero-administration data lake that offers both fundamental and advanced user and data/knowledge management functionality for big cultural data management. To the best of our knowledge, this is the first system that focuses on collecting, managing, analyzing, and sharing diverse, multi-faceted data in the cultural heritage domain and allows users without an IT background to deploy, populate, and manage their own data stores within minutes, alleviating the need to rely on expensive custom-made solutions that require IT infrastructure and skills to maintain.
- (ii) We present the architectural solutions behind the proposed system, discuss the individual module technologies, and provide details on the module orchestration. We also describe several novel services that include automated data harvesting from the web and social media, integrated user input collection via standard and customizable data types, easy-to-perform data analysis and visualization, publish/subscribe functionality to facilitate sharing of different facets and data shards, and access control mechanisms.
- (iii) We advocate the appropriateness of our approach for the cultural heritage domain and showcase different scenarios that highlight its usefulness for cultural data management.
2. Related Work
2.1. Social Data Management in the Cultural Heritage Domain
2.2. Information Systems for Cultural Heritage
2.3. Information Systems for Museums
3. System Architecture
3.1. The Data Acquisition Module
- (i) The Data Harvesting submodule, which allows Hydria users to set up and deploy automated data collection crawlers and web scrapers (spiders) that are able to navigate the web and popular social media platforms, discover and harvest content of interest, and store the harvested data in the Hydria data lake. The Data Harvesting submodule is discussed in more detail in Section 3.1.1.
- (ii) The Structured Data Input submodule, which allows Hydria users to import whole datasets into Hydria and collect user input data by exploiting several built-in and customizable data entry forms. The Structured Data Input submodule is elaborated on in Section 3.1.2.
3.1.1. The Data Harvesting Submodule
- The engine receives a scrape request from the spider, which is a custom-made class that is used to parse responses and extract scraped data, and pushes the request to the scheduler for later use; the engine then reads scheduled tasks from the scheduler in order to process them further.
- The scheduler performs scheduling on the available requests and returns to the engine the next request to be further processed (downloaded). Then, the engine forwards the request to the downloader through an appropriate message broker. When the page download is completed, the downloader creates a response and forwards it back to the engine via the message broker.
- Once the engine gets the response, it moves it to the spider for processing through the message broker. When the spider processing is over, the scraped data are returned and new requests are sent to the engine via the message broker.
- The engine initially passes the scraped data to the item pipeline, which is the software that is in charge of processing the data after they have been extracted by the spiders, then dispatches the processed request to the scheduler and subsequently awaits the next request to scrape.
- The above steps are performed iteratively, until no more scheduled requests are available. Note that all the extracted items are temporarily pushed to a persistent local data store (one per initiated scraper).
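The request/response cycle described above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions, not Hydria or Scrapy code: the names `Request`, `Response`, `download`, `spider_parse`, `item_pipeline`, and `run_engine` are hypothetical, the "web" is a dictionary, the message broker is replaced by direct function calls, and the duplicate-URL filter is an addition that keeps the toy loop finite.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

Item = Dict[str, str]


@dataclass
class Request:
    url: str


@dataclass
class Response:
    url: str
    body: str


def download(request: Request, web: Dict[str, str]) -> Response:
    """Downloader stand-in: 'fetch' a page by looking it up in a dictionary."""
    return Response(request.url, web.get(request.url, ""))


def spider_parse(response: Response) -> Tuple[List[Item], List[Request]]:
    """Spider stand-in: emit one item per page and follow 'link:<url>' tokens."""
    items = [{"url": response.url, "length": str(len(response.body))}]
    follow = [Request(tok[5:]) for tok in response.body.split()
              if tok.startswith("link:")]
    return items, follow


def item_pipeline(item: Item, store: List[Item]) -> None:
    """Item pipeline stand-in: post-process an item and push it to the local store."""
    store.append({**item, "processed": "yes"})


def run_engine(start: Iterable[Request], web: Dict[str, str]) -> List[Item]:
    """Engine stand-in: loop until the scheduler has no more requests."""
    scheduler = deque(start)                 # pending requests
    seen = {r.url for r in scheduler}        # assumption: skip already-seen URLs
    store: List[Item] = []                   # one local store per scraper run
    while scheduler:
        request = scheduler.popleft()        # scheduler returns the next request
        response = download(request, web)    # engine -> downloader -> engine
        items, new_requests = spider_parse(response)  # engine -> spider
        for item in items:
            item_pipeline(item, store)       # scraped data go to the pipeline
        for req in new_requests:             # new requests go back to the scheduler
            if req.url not in seen:
                seen.add(req.url)
                scheduler.append(req)
    return store
```

Calling `run_engine([Request("a")], {"a": "hello link:b", "b": "world link:a"})` visits page `a`, follows its link to `b`, and stops once no scheduled requests remain.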
3.1.2. The Structured Data Input Submodule
3.2. The Data Management Module
- (i) Hydria allows users to share data pond templates by supporting the reuse of all or part of a data pond's fields (e.g., demographic data in questionnaires) across different data ponds. To promote this functionality, the data pond creation service prompts the user to consider reusing one of the available data pond templates before creating a new data pond.
- (ii) Hydria provides users with the ability to dynamically create, store, and edit drop-down lists of elements. To do so, the user specifies a unique name for the drop-down list and enters the list elements. Subsequently, when defining a multiple choice field (i.e., attribute), the user sets the data type to multiple choice and either selects one of the stored drop-down lists from the pop-up window or dynamically creates a new one, which is thereafter stored along with the other drop-down lists for further (re)use.
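One way to picture template reuse and stored drop-down lists is the following minimal sketch. It is not Hydria's data model: `PondField`, `DataPond`, `Registry`, and all method and field names are hypothetical, and everything is kept in memory purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class PondField:
    name: str
    data_type: str                  # e.g. "text", "number", "multiple_choice"
    dropdown: Optional[str] = None  # name of a stored drop-down list, if any


@dataclass
class DataPond:
    name: str
    fields: List[PondField] = field(default_factory=list)


class Registry:
    """Hypothetical in-memory store for drop-down lists and data pond templates."""

    def __init__(self) -> None:
        self.dropdowns: Dict[str, List[str]] = {}
        self.templates: Dict[str, List[PondField]] = {}

    def create_dropdown(self, name: str, elements: Iterable[str]) -> None:
        # Drop-down lists are identified by a unique name.
        if name in self.dropdowns:
            raise ValueError(f"drop-down list {name!r} already exists")
        self.dropdowns[name] = list(elements)

    def multiple_choice(self, field_name: str, dropdown: str,
                        elements: Optional[Iterable[str]] = None) -> PondField:
        # Reuse a stored list, or create (and store) a new one on the fly.
        if dropdown not in self.dropdowns:
            if elements is None:
                raise KeyError(f"unknown drop-down list {dropdown!r}")
            self.create_dropdown(dropdown, elements)
        return PondField(field_name, "multiple_choice", dropdown)

    def save_template(self, name: str, fields: Iterable[PondField]) -> None:
        self.templates[name] = list(fields)

    def new_pond(self, name: str, template: Optional[str] = None,
                 reuse: Optional[Iterable[str]] = None,
                 extra: Iterable[PondField] = ()) -> DataPond:
        # Copy all template fields, or only those named in `reuse`.
        fields: List[PondField] = []
        if template is not None:
            wanted = None if reuse is None else set(reuse)
            fields = [f for f in self.templates[template]
                      if wanted is None or f.name in wanted]
        return DataPond(name, fields + list(extra))
```

In this sketch, a questionnaire pond could reuse only the `age` field of a stored `demographics` template via `new_pond("museum_survey", template="demographics", reuse=["age"])` and then add its own fields through `extra`.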
3.3. The Data Analysis Module
3.4. The Publish/Subscribe Module
- A user may use the data pond search functionality to look for data ponds stored within the Hydria data lake that satisfy a given keyword query; after selecting one or more data ponds that are included in the result, she may send a subscription request to the owner of the specific data pond(s) asking for permission to access the data pond’s schema definition, i.e., the list of attributes of the data pond, their descriptions and data types. If the owner of the data pond accepts the subscription request, the user is eligible to access the data pond’s schema definition.
- After examining the data pond schema, the user may decide to request access to specific attributes of the data pond at record level. In this case, she may select one or more attributes of the targeted data pond and send a follow-up subscription request to the owner of the data pond. Once the owner receives the new request, she may deny the request, accept it as is, or remove any attributes that should not be shared at record level and confirm the sharing of the remaining ones.
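The two-step subscription process can be sketched as follows. This is an illustrative model only, assuming a single owner per data pond and in-memory state; the names `Pond`, `approve_schema_request`, `view_schema`, `decide_record_request`, and `view_records` are hypothetical and do not correspond to Hydria's actual interfaces.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class Pond:
    owner: str
    schema: Dict[str, str]                       # attribute name -> data type
    records: List[Dict[str, object]]
    schema_readers: Set[str] = field(default_factory=set)
    record_grants: Dict[str, Set[str]] = field(default_factory=dict)


def approve_schema_request(pond: Pond, user: str) -> None:
    """Step 1: the owner accepts a request to see the schema definition."""
    pond.schema_readers.add(user)


def view_schema(pond: Pond, user: str) -> Dict[str, str]:
    if user != pond.owner and user not in pond.schema_readers:
        raise PermissionError(f"{user} may not view this schema")
    return dict(pond.schema)


def decide_record_request(pond: Pond, user: str, requested: Iterable[str],
                          withheld: Iterable[str] = (),
                          accept: bool = True) -> Set[str]:
    """Step 2: the owner denies, accepts as is, or accepts minus some attributes."""
    if user not in pond.schema_readers:
        raise PermissionError("record-level requests follow schema access")
    unknown = set(requested) - set(pond.schema)
    if unknown:
        raise ValueError(f"unknown attributes: {sorted(unknown)}")
    if not accept:
        return set()
    granted = set(requested) - set(withheld)
    pond.record_grants[user] = granted
    return granted


def view_records(pond: Pond, user: str) -> List[Dict[str, object]]:
    """Return each record restricted to the attributes shared with `user`."""
    attrs = pond.record_grants.get(user, set())
    return [{a: r[a] for a in pond.schema if a in attrs and a in r}
            for r in pond.records]
```

Keeping schema access and record-level access as separate grants mirrors the text: a subscriber can inspect attribute names and types first, and only the attributes the owner confirms are ever exposed at record level.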
3.5. The User Management Module
- (i) System administrators, who are able to access, create, edit, preview, delete, and filter all the data ponds and records that are stored in the data lake; they are also capable of managing all types of Hydria users, as well as the user-role assignments.
- (ii) Power users, who are typically curators in charge of one or more data ponds in the Hydria ecosystem. They are typically able to create new data ponds and initiate the collection of data from different online sources such as social media, the web, existing datasets, or end-users via questionnaires and surveys. They have access privileges to the data ponds that they create and to the records within these data ponds; they may analyze, filter, and visualize the stored data, and collaborate with other power users in Hydria in the context of data sharing. A power user may also request to link specific end-users to new or existing data ponds.
- (iii) End-users, who may participate in surveys and questionnaires issued by Hydria power users and view/edit their own data; an end-user is allowed neither to create new data ponds nor to view data of other end-users within the same data pond. They may, however, use the analysis tools to perform limited analysis tasks on their own contributed data.
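The three roles above amount to a simple access policy, which the following sketch encodes. It is a simplified illustration rather than Hydria's access control implementation: `Role`, `User`, `Record`, `can_create_pond`, and `can_view_record` are hypothetical names, and access obtained through data sharing between power users is deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    ADMIN = "system administrator"
    POWER = "power user"
    END = "end-user"


@dataclass(frozen=True)
class User:
    name: str
    role: Role


@dataclass(frozen=True)
class Record:
    pond: str          # data pond the record belongs to
    pond_owner: str    # power user who created the pond
    contributor: str   # user who contributed the record


def can_create_pond(user: User) -> bool:
    # End-users may not create data ponds.
    return user.role in (Role.ADMIN, Role.POWER)


def can_view_record(user: User, record: Record) -> bool:
    if user.role is Role.ADMIN:
        return True                               # administrators see everything
    if user.role is Role.POWER:
        return record.pond_owner == user.name     # only ponds they created
    return record.contributor == user.name        # end-users: own data only
```

For example, an end-user who contributed a record to a survey pond can view that record, but not a record contributed by another end-user to the same pond.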
3.6. Implementation Aspects
4. The TripMentor Case Study
4.1. Data Harvesting
- (i) The page URL was matched against the regular expression patterns .*/attraction/.* and .*/attica/.* (note that in Hydria regular expressions are case insensitive), using the url_regex classifier type, since the URLs of the PoIs in TripAdvisor start with the word attraction and contain the name of the region (Attica, in our case).
- (ii) The page body was matched against the regular expression pattern .*/greece/.* using the body_regex classifier type, to ensure that the word Attica found in the URL actually refers to the Greek region (and not, e.g., to the city of Attica in the State of New York).
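The effect of the two classifier rules can be reproduced with Python's standard `re` module. The sketch below is not Hydria's classifier: the function names and the example URL and page body are hypothetical, the patterns are copied as they appear in the text, and it assumes that a page is relevant only when every URL pattern and every body pattern matches.

```python
import re
from typing import Iterable

# Patterns as given in the text; matching is case insensitive.
URL_PATTERNS = [r".*/attraction/.*", r".*/attica/.*"]   # url_regex rule
BODY_PATTERNS = [r".*/greece/.*"]                       # body_regex rule


def matches_all(text: str, patterns: Iterable[str]) -> bool:
    """True if the whole text matches every pattern, ignoring case.

    re.DOTALL lets '.' span newlines, which matters for multi-line page bodies.
    """
    flags = re.IGNORECASE | re.DOTALL
    return all(re.fullmatch(p, text, flags) is not None for p in patterns)


def is_relevant(url: str, body: str) -> bool:
    # Assumption: both the URL rule and the body rule must hold.
    return matches_all(url, URL_PATTERNS) and matches_all(body, BODY_PATTERNS)
```

With these definitions, a hypothetical page at `https://example.org/Attraction/Attica/acropolis` whose body links to `/Greece/` is accepted, whereas a page with the same URL shape whose body never mentions `/greece/` is rejected.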
4.1.1. Facebook Spiders
4.1.2. TripAdvisor Spiders
4.2. Importing Datasets and Adding/Modifying Records
4.3. Reusing Data Ponds and Data Pond Templates
4.4. Visualizing Information
5. Indicative Application Scenarios for Hydria
5.1. Hydria for Curators
5.2. Hydria for Researchers
5.3. Hydria for Data Scientists
5.4. Hydria for End Users
6. Conclusions and Outlook
Author Contributions
Funding
Conflicts of Interest
Data Ponds | Records |
---|---|
facebook_venues | 10,405 |
facebook_posts | 139,880 |
facebook_comments | 203,523 |
facebook_events | 150 |
tripadvisor_venues | 6869 |
tripadvisor_user_reviews | 298,769 |

Venue Category | # of PoIs |
---|---|
Arts and Entertainment | 2116 |
Breakfast and Brunch Restaurants | 114 |
Cafe | 2854 |
Hotels | 778 |
Landmarks | 614 |
Museums | 210 |
Parks and Outdoors | 712 |
Restaurants | 3007 |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Deligiannis, K.; Raftopoulou, P.; Tryfonopoulos, C.; Platis, N.; Vassilakis, C. Hydria: An Online Data Lake for Multi-Faceted Analytics in the Cultural Heritage Domain. Big Data Cogn. Comput. 2020, 4, 7. https://doi.org/10.3390/bdcc4020007