Thu

27.07.2023

IACT'23 - Taipei, Taiwan

The 1st International Workshop on Implicit Author Characterization from Texts for Search and Retrieval (IACT’23) held in conjunction with the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

Call for papers

To draw the research community's attention to the limitations of current models in recognizing and characterizing AI vs. human authors, we propose to organize the first edition of the IACT workshop under the umbrella of the SIGIR conference. Research works submitted to the workshop should foster scientific advances in all aspects of author characterization, including but not limited to the following:

Differentiation between AI-generated content and human-generated content and bot profiling
Characterization of conversational agents
Feature detection of authors for human vs. AI determination
Prompt understanding and recognition in language models
Personalized question answering and conversation generation
Troll identification on social media
Review authenticity estimation
Multi-modal, multi-genre, and multilingual author analysis
Character analysis, description, and representation in narrative texts
Detecting implicit expressions of sentiment, emotion, opinion, and bias
Transfer learning for implicit author characterization
Implicit author characterization annotation schema
Evaluation of implicit author characterization
Author characterization in low-resource languages and under-studied domains
Resources and dataset showcase
Accountability and regulation of AI-based information extraction, retrieval, and content generation
Copyright issues of AI-generated content
Ethical and privacy implications of author characterization and implicit information extraction
Fairness and bias of AI-generated content

Important Dates

February 15, 2022: CFP
May 23, 2023 (extended, final): Submission deadline
June 13, 2023: Review due
June 20, 2023: Acceptance Notification Date
June 27, 2023: Camera-ready copies
July 20, 2023: Proceedings online at CEUR
July 27, 2023: Workshop

Submissions

All papers must be original and not simultaneously submitted to another journal or conference. The following paper categories are welcome:

Full research papers: up to 8 pages. Original and high-quality unpublished contributions to the theory and practical aspects of the workshop topics.
Short research papers: up to 5 pages. These can describe ongoing research, resources, and demos.
Negative result papers: up to 5 pages. Papers highlighting tested hypotheses that did not produce the expected outcome are also welcome.
Position papers: up to 5 pages. Discussing current and future research directions.

The submissions must be anonymous and will be peer-reviewed by at least two program committee members.

The authors of accepted papers will be given 15 minutes for a short oral presentation. The workshop will run as a hybrid event to allow virtual attendance and meet the SIGIR format.

Papers must be submitted electronically in PDF format through EasyChair. All submissions must be in English and formatted according to the one-column CEUR-ART style with no page numbers. Word and LaTeX templates are available in the zip archive at https://ceur-ws.org/Vol-XXX/CEURART.zip. There is also an Overleaf page for LaTeX users.

Shared Task: ML/NLP Competition on Automatic Classification of Literary Epochs (CoLiE)

To advance the field of implicit temporal information retrieval from text, this competition challenges participants to develop automatic methods that identify the literary epoch of a given text, which is treated here as the implicit temporal context of a book. The task on Automatic Classification of Literary Epochs (CoLiE) is to identify the epoch from the writing style of the text, choosing among five classes: (1) Romanticism (1798-1837), (2) Victorian Literature (1837-1901), (3) Modernism (1900-1945), (4) Postmodernism (1945-2000), and (5) our days (from 2000).
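As an illustration of the label set only, the five epochs and their year ranges can be written down as a small lookup table. This is a minimal sketch, not the competition's data format: the names `EPOCHS` and `epochs_for_year` are hypothetical, and treating the listed bounds as inclusive is an assumption. Note that the listed ranges share boundary years (for example 1837, and 1900-1901), so a year can fall into two epochs; the task itself asks for the epoch to be predicted from writing style, not from a known year.

```python
from typing import List, Optional, Tuple

# Epoch labels and year ranges as listed in the task description.
# An upper bound of None means the range is open-ended.
# Assumption: both bounds are treated as inclusive.
EPOCHS: List[Tuple[int, str, int, Optional[int]]] = [
    (1, "Romanticism", 1798, 1837),
    (2, "Victorian Literature", 1837, 1901),
    (3, "Modernism", 1900, 1945),
    (4, "Postmodernism", 1945, 2000),
    (5, "Our days", 2000, None),
]


def epochs_for_year(year: int) -> List[str]:
    """Return every epoch whose listed range contains the given year."""
    return [
        name
        for _, name, start, end in EPOCHS
        if year >= start and (end is None or year <= end)
    ]


if __name__ == "__main__":
    for y in (1820, 1900, 2010):
        print(y, epochs_for_year(y))
```

Because the ranges overlap at their boundaries, the function returns a list rather than a single label.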

This competition is open to anyone with a passion for information retrieval, machine learning, and natural language processing. Whether you are a seasoned expert or a newcomer to the field, we welcome you to participate and extend the boundaries of automated text analysis!

Competition site: http://www.kaggle.com/competitions/colie

Competition Timeline

May 28, 2023: The competition is open to participants. Training and validation sets together with their labels are available.
July 10, 2023: The test dataset is available.
July 17, 2023, 23:59 UTC: Final submission deadline.
July 27, 2023: The winners are announced at the special session at the IACT'23 workshop.

Organization

Organizing Committee

Marina Litvak - marinal@ac.sce.ac.il; Shamoon College of Engineering; Beer Sheva, Israel
Irina Rabaev - irinar@ac.sce.ac.il; Shamoon College of Engineering; Beer Sheva, Israel
Alípio Mário Jorge - amjorge@fc.up.pt; University of Porto; Porto, Portugal
Ricardo Campos - ricardo.campos@ipt.pt; Polytechnic Institute of Tomar INESC TEC, Portugal; Porto, Portugal
Adam Jatowt - adam.jatowt@uibk.ac.at; University of Innsbruck; Innsbruck, Austria

Program Committee

Natalia Vanetik, Shamoon College of Engineering
Lei Li, Beijing University of Posts and Telecommunications
Bruno Martins, IST and INESC-ID - Instituto Superior Técnico, University of Lisbon
Alvaro Figueira, CRACS / INESC TEC and University of Porto
Horacio Saggion, Universitat Pompeu Fabra
Evelin Amorim, Universidade Federal do Espírito Santo
Yang Zhang, Kyoto University
Moreno La Quatra, Politecnico di Torino
Chaya Liebeskind, Jerusalem College of Technology
Yihong Zhang, Osaka University
Satya Almasian, Heidelberg University
Anubhav Jangra, Indian Institute of Technology Patna
Shiva Pentyala, Salesforce AI
Valia Kordoni, Humboldt University Berlin
Jonathan Schler, Holon Institute of Technology (HIT)
Brenda Salenave Santana, Universidade Federal do Rio Grande do Sul
Nuno Guimaraes, CRACS - INESC TEC
Lin Miao, Beijing Information Science & Technology University
Mark Last, Ben-Gurion University of the Negev
Antoine Doucet, University of La Rochelle
Sophie Krimberg, Shamoon College of Engineering
Ignatius Ezeani, Lancaster University
Sandra Vilas Boas Jardim, Instituto Politécnico de Tomar
Mahmoud El-Haj, Lancaster University
Anastasia Giachanou, Utrecht University

Proceedings Chairs

Marina Litvak
Irina Rabaev

Web and Dissemination Chairs

Shay Shabtay, Shamoon College of Engineering
Hugo Sousa, INESC TEC & University of Porto

Invited Speakers

20 Years of Analyzing Multilingual Propaganda Content on the Web

Abstract: From the early years of the Internet as a global information infrastructure, multilingual propaganda content has been circulating on the web. For example, the WWW played a critical role in the planning of the 9/11 attacks, both as a source of inspirational information and as a safe means of covert communication between the plotters. Shortly after the tragic events of 2001, our binational team of US and Israeli researchers started to explore the online activities of various hate groups. Initially, we developed a prototype of a monitoring system aimed at detecting frequent visitors of terrorist websites, who could be influenced by terrorist propaganda and eventually develop into what we call today “the lone wolf attackers”. Shortly after, we focused on another, closely related question: what makes terrorist-generated propaganda content in various languages different from unbiased news reports discussing similar topics? Over the years, we developed prototypes of several additional text analysis tools such as text-summarization algorithms, which can automatically summarize large amounts of untranslated content in any language, as well as AI tools for automated detection of metaphoric language. After presenting the motivational and ethical foundations of our research, I plan to describe some of the methods developed during the last two decades and, finally, discuss past and future challenges in this important and fascinating domain.

Speaker: Prof. Mark Last - Founding Director of the Data Science Research Center. Ben-Gurion University of the Negev, Israel.

Bio: Prof. Mark Last is a Full Professor at the Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Israel, and the Head of the Data Engineering Program. Prof. Last has published over 210 peer-reviewed papers, two monographs, and 11 edited volumes on data mining, text mining, and cyber security. According to Google Scholar, his work has been cited more than 6,000 times. He is a Senior Member of the IEEE Computer Society and a Professional Member of the Association for Computing Machinery (ACM). Prof. Last currently serves as an Action Editor of Data Mining and Knowledge Discovery and an Editorial Board Member of Machine Learning Journal and ACM Transactions on Intelligent Systems and Technology. Previously, he served as an Associate Editor of IEEE Transactions on Systems, Man, and Cybernetics - Part C (2004–2012), Pattern Analysis and Applications (2007–2016), and IEEE Transactions on Cybernetics (2013–2019). His main research interests are data mining, cross-lingual text mining, soft computing, cyber intelligence, and medical informatics.

The promises and perils of AI Information Retrieval

Abstract: Search engines are the primary way most people access information today, but entering a few keywords and getting a list of results ranked by some unknown function is not ideal. A new generation of Artificial Intelligence-based information access systems, which includes Microsoft’s Bing/ChatGPT, Google/Bard and Meta/LLaMA, is upending the traditional search engine mode of search input and output. These systems are able to take full sentences and even paragraphs as input and generate personalized Natural Language responses. AI systems like ChatGPT and Bard are built on Large Language Models (LLMs). LLM-based systems generate personalized responses to fulfill information queries. However, there are plenty of downsides as well. In this talk, we will focus on AI-based information access systems, weighing their advantages and disadvantages from both the developers' and the users' viewpoints.

Speaker: Prof. Dr. Valia Kordoni - Humboldt-Universität Berlin, Germany.

Bio: Valia Kordoni is a Deputy Chair of Computational Linguistics at the Department of English at Humboldt University Berlin. She is an active researcher in Language Technology (LT), Data Science and Artificial Intelligence (AI). Her research interests include multilingual Robust Natural Language Analytics, Computational Semantics, Discourse and Human Cognition Modeling, as well as Machine Learning for the automated acquisition of knowledge, especially concerning multiword units and figurative language and their impact in Natural Language Processing, spoken and written. She has been the president of the ACL (Association for Computational Linguistics) SIGLEX’s (Special Interest Group on Lexicon) MWE (Multiword Expressions) Group. She was the Local Chair of ACL 2016 – The 54th Annual Meeting of the Association for Computational Linguistics. She has coordinated and contributed to many projects funded by the EU, the DFG (Germany), the BMBF (Germany), the DAAD (Germany), as well as the NSF (USA).

The Venue

The conference will be held in Taipei, Taiwan. More information: https://sigir.org/sigir2023/

IACT2023 is part of SIGIR'23
