Invited Research Seminar
By Gabriela Ferraro
Extracting Normative Rules from Legal Texts: Challenges (short presentation)
Laws and regulations govern our daily lives, and we often need expert advice to deal with them. An automated system capable of interpreting and reasoning about the laws and regulations in legal documents would ease that burden. However, a necessary first step towards automating the process is to extract such laws and regulations in a formalised format that a machine can interpret and reason about. To do so, it is necessary to extract normative rules from sentences written in natural language.
In this talk, I will present some of the NLP-related challenges of extracting normative rules from sentences, and explore promising research avenues for approaching this problem.
Transfer Learning for Hate Speech Detection in Social Media (long presentation)
In today's society, more and more people are connected to the Internet, and its information and communication technologies have become an essential part of our everyday life. Unfortunately, the flip side of this increased connectivity to social media and other online content is cyberbullying and online hatred, among other harmful and anti-social behaviors. Models based on machine learning and natural language processing provide a way to detect this hate speech in web text in order to make discussion forums and other media and platforms safer. The main difficulty, however, is annotating a sufficiently large number of examples to train these models. In this paper, we report on developing automated text analytics methods capable of jointly learning a single representation of hate from several smaller, unrelated data sets.
We train and test our methods on a total of 37,520 English tweets that have been annotated to distinguish harmless messages from racist or sexist ones in the first detection task, and from hateful or offensive tweets in the second detection task.
Our most sophisticated method combines a deep neural network architecture with transfer learning. It achieves a macro-averaged F1 of 78% and 72% in the first and second task, respectively. This method also makes it possible to generate an interpretable two-dimensional text visualization, called the Map of Hate, that is capable of separating different types of hate speech and explaining what makes text harmful. These methods and insights hold potential not only for safer social media, but also for a reduced need to expose human moderators and annotators to distressing online messaging.
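For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a macro-averaged F1 can be computed: the F1 score is calculated separately for each class and then averaged without weighting by class size, so rare classes count as much as frequent ones. The function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the talk or its data sets.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, for illustration only.
y_true = ["harmless", "harmless", "harmless", "harmless", "racist", "sexist"]
y_pred = ["harmless", "harmless", "harmless", "racist", "racist", "harmless"]
print(f"macro-F1 = {macro_f1(y_true, y_pred):.3f}")
```

In this toy example four of the six predictions are correct, yet the macro-averaged F1 is noticeably lower than the accuracy because the single "sexist" example is missed entirely and that class contributes an F1 of zero to the average.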
Gabriela Ferraro is a research scientist in Natural Language Processing and Computational Linguistics at the Commonwealth Scientific and Industrial Research Organisation (CSIRO) in Australia, and an Adjunct Research Fellow in the College of Engineering and Computer Science at the Australian National University.
She received a BA in Computer Science from Champagnat University (Argentina) in 2004, and a Master's degree in Science of Language and Applied Linguistics from Universitat Pompeu Fabra (Spain) in the following year. In 2006, she joined the TALN Research Group at the same university and received a PhD under the supervision of Leo Wanner in 2012. In 2013, she joined the Machine Learning Research Group at NICTA, where she spent four years, and the Australian National University, where she has been a lecturer ever since.
Host: Leo Wanner