Digital Humanities

Many scientific domains use computer-based methods and approaches to verify hypotheses or to explore possible patterns in their datasets. This course focuses mainly on text-based datasets and on machine learning methods for extracting both content and stylistic patterns from texts (historical documents, newspaper articles, political speeches, tweets, etc.). Such datasets are typical of the humanities and social sciences. Different approaches for detecting evolution over time, or differences between authors, genders, authors' ages, and psychological profiles, will be discussed.

In addition, the course provides an introduction to basic network concepts such as density, centrality, clustering, and community detection. Different network types will be presented in order to evaluate hypotheses or to simulate various contagion models (e.g., the spreading of epidemics, information, or fake news). Applications related to the Web will be discussed.
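As a rough illustration of two of the network concepts named above, the following minimal sketch computes the density and the degree centrality of a small undirected network in plain Python. The toy edge list and the function names (build_graph, density, degree_centrality) are hypothetical and chosen only for this example; they are not taken from the course material.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return dict(adjacency)


def density(adjacency):
    """Fraction of all possible undirected edges that are present."""
    n = len(adjacency)
    if n < 2:
        return 0.0
    # Each edge is stored twice (once per endpoint), hence the division by 2.
    m = sum(len(neighbours) for neighbours in adjacency.values()) / 2
    return 2 * m / (n * (n - 1))


def degree_centrality(adjacency):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(neighbours) / (n - 1)
            for node, neighbours in adjacency.items()}


# Hypothetical toy network: four people and who exchanged letters with whom.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
graph = build_graph(edges)

print("density:", density(graph))
print("degree centrality:", degree_centrality(graph))
```

In this toy network, node A is linked to every other node and therefore has the highest degree centrality, while four of the six possible edges are present.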


Code 32108
Type Course
Site Neuchâtel
Track(s) T3 – Advanced Information Processing
Semester A2022


Learning Outcomes

Learning outcomes The main objective of this course is to introduce students to the various techniques and strategies that can be used to

  1. store, convert, and correct texts to generate a corpus
  2. extract useful patterns from a corpus
  3. compute the intertextual similarities between texts or corpora (clustering)
  4. verify the authorship of a document or draw the profile of the true author
  5. apply a fair evaluation of those text categorization methods
  6. understand network concepts (density, centrality, community detection)
  7. design, generate, and analyze network-based simulations (e.g., contagion, information diffusion)
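
To give an idea of what item 3 can look like in practice, here is a minimal sketch that compares two texts through the cosine similarity of their word-frequency vectors. This is only one of many possible similarity measures and is an assumption made for illustration, not necessarily the measure used in the course. The sample sentences and the helper names (word_counts, cosine_similarity) are hypothetical.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(counts_a, counts_b):
    """Cosine of the angle between two word-frequency vectors."""
    shared = set(counts_a) & set(counts_b)
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical sample texts, invented for this example.
speech_a = "We will invest in schools and we will invest in health."
speech_b = "We will cut taxes and we will reduce spending."
tweet = "Great match tonight!"

print(cosine_similarity(word_counts(speech_a), word_counts(speech_b)))
print(cosine_similarity(word_counts(speech_a), word_counts(tweet)))
```

The two speeches share several frequent words and obtain a similarity above zero, while the tweet shares no word with the first speech and obtains exactly zero. A pairwise matrix of such values is a typical input for clustering texts.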
Lecturer(s) Jacques Savoy
Language English
Course Page

The course page in ILIAS can be found at

Schedules and Rooms

Period Weekly
Schedule Wednesday, 08:45 - 12:00
Location UniNE, Unimail
Room B013


Evaluation type written exam

Additional information


First Lecture
The first lecture will take place on Wednesday, 21.09.2022 at 08:45 in UniNE, Unimail, room B013.


  • Karsdorp, F., Kestemont, M., Riddell, A. (2021). Humanities Data Analysis: Case Studies with Python. Princeton University Press: Princeton.
  • Savoy, J. (2020). Machine Learning Methods for Stylometry: Authorship Attribution and Author Profiling. Springer: Cham.