Saturday, 10 January 2015, 19:26

Parallel Processing of Public Open Data with the MapReduce Paradigm: A Case Study

Billel Arres 1,*, Omar Boussaid 1, Nadia Kabachi 1, Fadila Bentayeb 1

* Corresponding author

ERIC - Equipe de Recherche en Ingénierie des Connaissances

Abstract: Many governments and states are now pursuing a strategy of opening their public data. However, the volume of this open data is growing steadily and will soon exceed current processing and storage capacities. MapReduce, meanwhile, is one of the most widely used parallel programming models: built on a master-slave architecture, it enables parallel processing of very large data sets. In this paper, we propose a parallel approach based on MapReduce to process public open data. As a case study, we apply it to the official data sets of the French Ministry of Communication, implementing a parallel algorithm that ranks national museums and galleries by their degree of accessibility for people with disabilities. We assess the feasibility of our approach along two main lines: performance in terms of execution time, and visualization of the results with a view to integrating them into solutions such as geographic BI. This work can be applied to other cases involving very large data sets.
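The ranking idea described in the abstract can be illustrated with a minimal single-process sketch of the map, shuffle, and reduce steps. It is not the authors' algorithm: the record layout, the museum names, and the scoring rule (one point per accessibility feature listed) are all hypothetical assumptions made for illustration, and a real MapReduce framework would run the map and reduce functions in parallel on worker nodes rather than sequentially as here.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical records: (museum name, accessibility feature). The names and
# features are illustrative only, not taken from the ministry data sets.
RECORDS = [
    ("Museum A", "wheelchair"),
    ("Museum B", "hearing"),
    ("Museum A", "visual"),
    ("Museum C", "wheelchair"),
    ("Museum A", "hearing"),
    ("Museum B", "wheelchair"),
]


def map_record(record):
    """Map step: emit (museum, 1) for each accessibility feature listed."""
    museum, _feature = record
    yield museum, 1


def shuffle(pairs):
    """Shuffle step: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_scores(museum, values):
    """Reduce step: sum the counts into one (assumed) accessibility score."""
    return museum, sum(values)


def rank_museums(records):
    """Run map, shuffle, and reduce sequentially, then sort by score."""
    mapped = chain.from_iterable(map_record(r) for r in records)
    reduced = [reduce_scores(k, v) for k, v in shuffle(mapped).items()]
    # Highest score first; ties are broken alphabetically for a stable ranking.
    return sorted(reduced, key=lambda kv: (-kv[1], kv[0]))


ranking = rank_museums(RECORDS)
for position, (museum, score) in enumerate(ranking, start=1):
    print(position, museum, score)
```

Because each map call depends only on its own record and each reduce call only on its own key, both steps can in principle be distributed across workers, which is the property the paradigm relies on.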

Keywords: Big data, Open data, MapReduce

Document type:

Conference paper

Big Spatial Data, Jul 2014, Orléans, France. pp.132-141

Field:

Computer science / Parallel, distributed, and shared computing

Source:

https://hal.archives-ouvertes.fr/hal-01023308
