IACT'23 - Taipei, Taiwan
The 1st International Workshop on Implicit Author Characterization from Texts for Search and Retrieval (IACT’23) held in conjunction with the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
Call for papers
To draw the research community's attention to the limitations of current models in recognizing and characterizing AI vs. human authors, we propose to organize the first edition of the IACT workshop under the umbrella of the SIGIR conference. Research works submitted to the workshop should foster scientific advances in all aspects of author characterization, including but not limited to the following:
- Differentiation between AI-generated content and human-generated content and bot profiling
- Characterization of conversational agents
- Feature detection of authors for human vs. AI determination
- Prompt understanding and recognition in language models
- Personalized question answering and conversation generation
- Troll identification on social media
- Review authenticity estimation
- Multi-modal, multi-genre, and multilingual author analysis
- Character analysis, description, and representation in narrative texts
- Detecting implicit expressions of sentiment, emotion, opinion, and bias
- Transfer learning for implicit author characterization
- Implicit author characterization annotation schema
- Evaluation of implicit author characterization
- Author characterization in low-resource languages and under-studied domains
- Resources and dataset showcase
- Accountability and regulation of AI-based information extraction, retrieval, and content generation
- Copyright issues of AI-generated content
- Ethical and privacy implications of author characterization and implicit information extraction
- Fairness and bias of AI-generated content
Important Dates
- February 15, 2022: CFP
- Extended to May 23, 2023 (final): Submission deadline
- June 13, 2023: Review due
- June 20, 2023: Acceptance Notification Date
- June 27, 2023: Camera-ready copies
- July 20, 2023: Proceedings online at CEUR
- July 27, 2023: Workshop
Submissions
All papers must be original and not simultaneously submitted to another journal or conference. The following paper categories are welcome:
- Full research papers: up to 8 pages. Original and high-quality unpublished contributions to the theory and practical aspects of the workshop topics.
- Short research papers: up to 5 pages. These can describe ongoing research, resources, and demos.
- Negative result papers: up to 5 pages. Papers highlighting tested hypotheses that did not yield the expected outcome are also welcome.
- Position papers: up to 5 pages. Discussing current and future research directions.
The submissions must be anonymous and will be peer-reviewed by at least two program committee members.
The authors of accepted papers will be given 15 minutes for a short oral presentation. The workshop will run as a hybrid event to allow virtual attendance and meet the SIGIR format.
Papers must be submitted electronically in PDF format through EasyChair. All submissions must be in English and formatted according to the one-column CEUR-ART style with no page numbers. Templates in Word and LaTeX are available in the zip archive at https://ceur-ws.org/Vol-XXX/CEURART.zip. There is also an Overleaf page for LaTeX users.
Shared Task: ML/NLP Competition on Automatic Classification of Literary Epochs (CoLiE)
To advance the field of implicit temporal information retrieval from text, the shared task on Automatic Classification of Literary Epochs (CoLiE) challenges participants to develop methods that automatically identify the literary epoch of a given text from its writing style. The epoch is treated here as the implicit temporal context of a book. The five epochs are: (1) Romanticism (1798-1837), (2) Victorian Literature (1837-1901), (3) Modernism (1900-1945), (4) Postmodernism (1945-2000), and (5) our days (from 2000).
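The epoch labels above can be written down as a small lookup table. The following Python sketch is illustrative only: the names EPOCHS and epochs_for_year are hypothetical and not part of the competition materials, and it maps a publication year to labels, whereas the task itself asks for classification from writing style. It assumes the listed year ranges are inclusive, so boundary years (1837, 1900, 1901, 1945, 2000) fall into two epochs; how such cases are labeled in the competition data is not stated here.

```python
from typing import List, Optional, Tuple

# Epoch labels and year ranges as listed in the task description.
# Assumption: ranges are inclusive; the last epoch is open-ended (None).
EPOCHS: List[Tuple[str, int, Optional[int]]] = [
    ("Romanticism", 1798, 1837),
    ("Victorian Literature", 1837, 1901),
    ("Modernism", 1900, 1945),
    ("Postmodernism", 1945, 2000),
    ("Our days", 2000, None),
]


def epochs_for_year(year: int) -> List[str]:
    """Return every epoch whose inclusive range contains `year`."""
    return [
        name
        for name, start, end in EPOCHS
        if year >= start and (end is None or year <= end)
    ]


if __name__ == "__main__":
    for y in (1820, 1900, 1960, 2010):
        print(y, epochs_for_year(y))
```

Returning a list rather than a single label keeps the overlap between adjacent epochs visible instead of silently picking one.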
This competition is open to anyone with a passion for information retrieval, machine learning, and natural language processing. Whether you are a seasoned expert or a newcomer to the field, we welcome you to participate and extend the boundaries of automated text analysis!
Competition site: http://www.kaggle.com/competitions/colie
Competition Timeline
- May 28, 2023: The competition is open to participants. Training and validation sets together with their labels are available.
- July 10, 2023: The test dataset is available.
- July 17, 2023, 23:59 UTC: Final submission deadline.
- July 27, 2023: The winners are announced at the special session at the IACT'23 workshop.
Organization
Organizing Committee
Marina Litvak - marinal@ac.sce.ac.il; Shamoon College of Engineering; Beer Sheva, Israel
Irina Rabaev - irinar@ac.sce.ac.il; Shamoon College of Engineering; Beer Sheva, Israel
Alípio Mário Jorge - amjorge@fc.up.pt; University of Porto; Porto, Portugal
Ricardo Campos - ricardo.campos@ipt.pt; Polytechnic Institute of Tomar and INESC TEC; Porto, Portugal
Adam Jatowt - adam.jatowt@uibk.ac.at; University of Innsbruck; Innsbruck, Austria
Program Committee
- Natalia Vanetik, Shamoon College of Engineering
- Lei Li, Beijing University of Posts and Telecommunications
- Bruno Martins, IST and INESC-ID - Instituto Superior Técnico, University of Lisbon
- Alvaro Figueira, CRACS / INESC TEC and University of Porto
- Horacio Saggion, Universitat Pompeu Fabra
- Evelin Amorim, Universidade Federal do Espírito Santo
- Yang Zhang, Kyoto University
- Moreno La Quatra, Politecnico di Torino
- Chaya Liebeskind, Jerusalem College of Technology
- Yihong Zhang, Osaka University
- Satya Almasian, Heidelberg University
- Anubhav Jangra, Indian Institute of Technology Patna
- Shiva Pentyala, Salesforce AI
- Valia Kordoni, Humboldt University Berlin
- Jonathan Schler, Holon Institute of Technology (HIT)
- Brenda Salenave Santana, Universidade Federal do Rio Grande do Sul
- Nuno Guimaraes, CRACS - INESC TEC
- Lin Miao, Beijing Information Science & Technology University
- Mark Last, Ben-Gurion University of the Negev
- Antoine Doucet, University of La Rochelle
- Sophie Krimberg, Shamoon College of Engineering
- Ignatius Ezeani, Lancaster University
- Sandra Vilas Boas Jardim, Instituto Politécnico de Tomar
- Mahmoud El-Haj, Lancaster University
- Anastasia Giachanou, Utrecht University
Proceedings Chairs
- Marina Litvak
- Irina Rabaev
Web and Dissemination Chairs
- Shay Shabtay, Shamoon College of Engineering
- Hugo Sousa, INESC TEC & University of Porto
Invited Speakers
20 Years of Analyzing Multilingual Propaganda Content on the Web
Abstract: From the early years of the Internet as a global information infrastructure, multilingual propaganda content has been circulating on the web. For example, the WWW played a critical role in the planning of the 9/11 attacks, both as a source of inspirational information and as a safe means of covert communication between the plotters. Shortly after the tragic events of 2001, our binational team of US and Israeli researchers started to explore the online activities of various hate groups. Initially, we developed a prototype of a monitoring system aimed at detecting frequent visitors of terrorist websites, who could be influenced by terrorist propaganda and eventually develop into what we today call “lone wolf attackers”. Shortly after, we turned to another, closely related question: what makes terrorist-generated propaganda content in various languages different from unbiased news reports discussing similar topics? Over the years, we developed prototypes of several additional text analysis tools, such as text-summarization algorithms that can automatically summarize large amounts of untranslated content in any language, as well as AI tools for automated detection of metaphoric language. After presenting the motivational and ethical foundations of our research, I plan to describe some of the methods developed during the last two decades and, finally, discuss past and future challenges in this important and fascinating domain.
Speaker: Prof. Mark Last - Founding Director of the Data Science Research Center, Ben-Gurion University of the Negev, Israel.
Bio: Prof. Mark Last is a Full Professor at the Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Israel, and the Head of the Data Engineering Program. Prof. Last has published over 210 peer-reviewed papers, two monographs, and 11 edited volumes on data mining, text mining, and cyber security. According to Google Scholar, his works have been cited more than 6,000 times. He is a Senior Member of the IEEE Computer Society and a Professional Member of the Association for Computing Machinery (ACM). Prof. Last currently serves as an Action Editor of Data Mining and Knowledge Discovery and an Editorial Board Member of Machine Learning Journal and ACM Transactions on Intelligent Systems and Technology. Previously, he served as an Associate Editor of IEEE Transactions on Systems, Man, and Cybernetics - Part C (2004–2012), Pattern Analysis and Applications (2007–2016), and IEEE Transactions on Cybernetics (2013–2019). His main research interests are data mining, cross-lingual text mining, soft computing, cyber intelligence, and medical informatics.
The Promises and Perils of AI Information Retrieval
Abstract: Search engines are the primary way most people access information today, but entering a few keywords and getting a list of results ranked by some unknown function is not ideal. A new generation of Artificial Intelligence-based information access systems, which includes Microsoft’s Bing/ChatGPT, Google/Bard, and Meta/LLaMA, is upending the traditional search engine mode of search input and output. These systems are able to take full sentences and even paragraphs as input and generate personalized natural language responses to fulfill information queries. AI systems like ChatGPT and Bard are built on Large Language Models (LLMs). However, there are plenty of downsides as well. In this talk, we will focus on AI-based information access systems, weighing their advantages and disadvantages from the developers' as well as the users' viewpoints.
Speaker: Prof. Dr. Valia Kordoni - Humboldt-Universität Berlin, Germany.
Bio: Valia Kordoni is a Deputy Chair of Computational Linguistics at the Department of English at Humboldt University Berlin. She is an active researcher in Language Technology (LT), Data Science, and Artificial Intelligence (AI). Her research interests include multilingual Robust Natural Language Analytics, Computational Semantics, Discourse and Human Cognition Modeling, as well as Machine Learning for the automated acquisition of knowledge, especially concerning multiword units and figurative language and their impact on Natural Language Processing, spoken and written. She has been the president of the MWE (Multiword Expressions) Group of SIGLEX (Special Interest Group on Lexicon) of the ACL (Association for Computational Linguistics). She was the Local Chair of ACL 2016 – The 54th Annual Meeting of the Association for Computational Linguistics. She has coordinated and contributed to many projects funded by the EU, the DFG (Germany), the BMBF (Germany), the DAAD (Germany), as well as the NSF (USA).
The Venue
The conference will be held in Taipei, Taiwan. More information: https://sigir.org/sigir2023/
IACT2023 is part of SIGIR'23