Call for Papers: Social Sciences and Humanities (SSH) tackle the Big Data Challenge

Workshop at EUDAT Conference

28 October 2013


Workshop Background

The Social Sciences and Humanities (SSH) are generally not associated with Big Data challenges, since the term is often reduced to the processing of large data volumes. However, there are further aspects, such as data that are highly distributed or that have complex interrelationships which need to be exploited. Apart from the heterogeneity and the highly structured nature of much of the data relevant to SSH research, Research Infrastructures (RIs) in the SSH have to tackle licensing and other intellectual property rights (IPR) issues. This raises the question of how services and data are best brought together, especially in a distributed network of data and service providers. These issues need to be solved for Research Infrastructures and Virtual Research Environments, especially when very large data sets are involved.

In the Social Sciences and Humanities, too, researchers, for example those tackling the Grand Challenges, are confronted with huge amounts and an increasing complexity of data. Experimentalists are engaging thousands of informants via mobile devices, which within a very few years will amount to 1 TB of data per day with complex internal structure. Researchers who want to understand how the human brain deals with an increasingly complex environment and with complex structures, not only in language, work on huge amounts of data from brain imaging and genetic sequencing, among other sources. Social scientists looking at various aspects of our unstable societies want to understand the minds of people, their complex inter-cultural relations and the dynamics of complex societal systems, and they too collect vast amounts of data covering complex relationships. More examples could be mentioned. Most of these studies are being carried out in highly interdisciplinary settings.

Clearly, these new paradigms and challenges require completely new strategies in SSH for managing, processing and sharing such data. Yet we cannot claim that SSH is prepared for these challenges with respect to data organization, the required infrastructures and the analytics needed at different layers (from hardware to software algorithms). We do not even know which activities are being considered or undertaken to tackle such challenges, or which facilities are actually being used.

Workshop Goals

The workshop is intended to a) identify and discuss such new paradigms, which we could describe with the term “Big Data Challenges in SSH”, and b) discuss requirements for the infrastructural embedding, the analytics and the skills that are needed. The goal of the workshop is to derive a few key messages that could be turned into a program under the EC’s H2020 framework program and perhaps under national funding programs.

Targeted Audience and Format

We would like to encourage SSH researchers and technologists working in SSH who have new ideas or have started projects that can be seen as tackling “Big Data Challenges in SSH” to report on their work or to sketch their ideas, and to elaborate on the embedding needed to carry out such work. The scope of the issues in data volume, complexity and analytics should clearly go beyond what has been known in the field for years. We would also like to encourage infrastructure builders to join the workshop to discuss the infrastructural needs.

We solicit abstracts of at most one page that address the following topics:

the new scientific challenge and why it could be called “Big Data Challenge in SSH”
the embedding and facilities that are required to meet the challenges
the help that infrastructures should provide, given the expectation that the amount and complexity of data will grow rapidly in the coming years

The plan is to first give every presenter a slot to describe the highlights of his or her idea or project; the length of the presentations will depend on the number of submissions that meet the criteria. The second part of this one-day workshop will then be devoted to discussing the needs of the presented papers in detail. In this part we will draw on the many technology experts who will be attending the conference.


Please send your abstract (at most one page, PDF) to the following address:

Deadline for submission: Sunday, September 15th, 2013, 22:00


People who want to stay for the whole conference need to pay 150 €, which includes lunch, coffee etc. People who want to attend only the workshop pay 30 €.

Organizers (underlined) and PC

Peter Wittenburg, MPI for Psycholinguistics,
Sebastian Drude, MPI for Psycholinguistics,
Steven Krauwer, U Utrecht,
Erhard Hinrichs, U Tübingen,
Tobias Blanke, King’s College,
Mark Hedges, King’s College,
Heike Neuroth, SUB Göttingen,
Hans Jørgen Marker, SND Gothenburg,
Markus Quandt, Gesis Cologne,
Alexia Katsanidou, Gesis Cologne,
Herman Stehouwer, MPI for Psycholinguistics,