Kiwi Farms Disruption: Methods, Datasets, and Ethics, and Forum and Imageboard Discussions

Authors:
(1) Anh V. Vu, University of Cambridge, Cambridge Cybercrime Centre ([email protected]);
(2) Alice Hutchings, University of Cambridge, Cambridge Cybercrime Centre ([email protected]);
(3) Ross Anderson, University of Cambridge and University of Edinburgh ([email protected]).
Table of Links
Abstract and 1 Introduction
2. Deplatforming and the Impacts
2.1. Related Work
2.2. The Disruption of Kiwi Farms
3. Methods, Datasets, and Ethics, and 3.1. Forum and Imageboard Discussions
3.2. Telegram Chats and 3.3. Web Traffic and Search Trends Analytics
3.4. Tweets Made by the Online Community and 3.5. Data Licensing
3.6. Ethical Considerations
4. The Impact on Forum Activity and Traffic, and 4.1. The Impact of Major Disruptions
4.2. Platform Displacement
4.3. Traffic Fragmentation
5. The Impacts on Relevant Stakeholders and 5.1. The Community that Started the Campaign
5.2. The Industry Responses
5.3. The Forum Operators
5.4. The Forum Members
6. Tensions, Challenges, and Implications and 6.1. The Efficacy of the Disruption
6.2. Censorship versus Free Speech
6.3. The Role of Industry in Content Moderation
6.4. Policy Implications
6.5. Limitations and Future Work
7. Conclusion, Acknowledgments, and References
Appendix A.
3. Methods, Datasets, and Ethics
Our primary method is data-driven, with findings supported by quantitative evidence derived from multiple longitudinal data sources, which we collected on a regular basis. Where quantitative measurements need enrichment (for example, when examining the relevant public statements of tech firms directly involved in the disruption, and announcements made by the forum operators), we use qualitative content analysis.
3.1. Forum and Imageboard Discussions
Besides mainstream social media channels such as Facebook and Twitter, independent platforms such as XenForo[4] and Infinity[5] have gained popularity as tools for building online communities. Despite being less visible and requiring more upkeep, these can offer greater resilience against external intervention because the operators have full control over the content and databases, which allows easy backup and redeployment in case of disruption. These platforms typically share a hierarchical data structure, running from bulletin boards down to threads linked to specific topics, each containing many posts. Because they facilitate free speech, they also make it easier to foster and spread hate and abusive speech. As part of the ExtremeBB dataset [62], we have been scraping the two most active forums associated with online harassment for years because of their increasingly toxic content: Kiwi Farms and Lolcow Farm.
Our collection includes not only posts but also associated metadata such as posting time, user profiles, reactions, and levels of toxicity, identity attack, and threat measured by the Google Perspective API as of January 2023.[6] The Perspective API also offers other measures such as insult and profanity [63], but we exclude them because they lack relevance to the purpose of this paper. This API uses crowdsourced annotations for model training and has been reported to outperform alternatives [64]. We strive to ensure data completeness by designing our scrapers to visit all sub-forums, threads, and posts while tracking the progress of each crawl so that it can resume incrementally after any interruption. A summary of the forum discussion data is shown in Table 1.
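The resumable crawl described above can be illustrated with a minimal sketch. It is not the authors' scraper: the in-memory `FORUM` dictionary, the `store` list, the checkpoint file format, and the `fail_on` parameter are all hypothetical stand-ins, chosen only to show how recording finished threads lets a crawl over the board, thread, and post hierarchy pick up where it stopped.

```python
import json
import os

# Hypothetical in-memory stand-in for a forum: board -> thread -> list of posts.
FORUM = {
    "board-a": {"thread-1": ["post 1", "post 2"], "thread-2": ["post 3"]},
    "board-b": {"thread-3": ["post 4", "post 5"]},
}


def load_checkpoint(path):
    """Return the set of thread keys already crawled, or an empty set."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as fh:
        return set(json.load(fh))


def save_checkpoint(path, done):
    """Write progress to a temporary file, then swap it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(sorted(done), fh)
    os.replace(tmp, path)


def crawl(forum, checkpoint_path, store, fail_on=None):
    """Visit every board, thread, and post, skipping threads already recorded.

    Posts are appended to `store` (a stand-in for a database).
    `fail_on` simulates an interruption when that thread is reached.
    """
    done = load_checkpoint(checkpoint_path)
    for board, threads in forum.items():
        for thread, posts in threads.items():
            key = f"{board}/{thread}"
            if key in done:
                continue  # already collected in an earlier run
            if key == fail_on:
                raise ConnectionError(f"interrupted at {key}")
            store.extend((key, post) for post in posts)
            done.add(key)
            save_checkpoint(checkpoint_path, done)
```

Checkpointing per thread is a design choice for the sketch: a crash costs at most one thread of repeated work. A real scraper for an active forum would also need to revisit threads for new posts, which this sketch does not attempt.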
Kiwi Farms was built on XenForo, but the operators have maintained the forum through their own efforts since late 2021, when XenForo officially revoked their license. Our data covers the entire history of the forum from early January 2013 to the end of 2022, with 10.1M posts in 48.3k threads made by 59.2k active users, providing a complete picture of its evolution over time. While some extremist forums have experienced fluctuating activity and rapid declines in recent years [62], Kiwi Farms showed steady growth until it was significantly disrupted in 2022 (see Figure 1). Our data precisely captures the major reported suspensions, including those in 2017 and 2022.
Kiwi Farms's main competitor is Lolcow Farm, an imageboard built on Infinity [65], [66]. While Kiwi Farms discussions are mainly text-based, Lolcow Farm is centered on descriptive images. While Kiwi Farms users adopt pseudonyms, Lolcow Farm users mostly remain hidden under the shared 'Anonymous' handle. We gathered a complete snapshot of Lolcow Farm from its inception in June 2014 to the end of 2022, covering 4.6M posts made in 10.0k threads. Lolcow Farm has fewer threads, but each typically contains many posts. This collection brings the total number of posts for both forums to 14.7M (and still growing). We excluded Lolcow, a smaller competitor to Kiwi Farms (also based on XenForo), as it disappeared in mid-2022 and had fewer than 30k posts in total. As Lolcow Farm is now the largest competitor, analysing it lets us estimate platform displacement when Kiwi Farms was down.
3.2. Telegram Chats
Activity levels increased in the Telegram groups associated with Kiwi Farms during periods when the forum was inaccessible. There are two channels: one is used primarily by the forum operators to disseminate announcements and updates, especially about where and when the forum could be accessed; the other is used by forum users mainly for general discussion. Both channels allow public access, so anyone can join and view historical messages. We used Telethon[7] to collect a snapshot of these channels throughout their lifetime until the end of 2022, covering 525k messages, 298k replies, and associated metadata such as view counts and 356k emoji reactions made by 2 502 active users. The data is likely to be complete, as our scraper runs in near real time and messages with metadata are fully captured using the official Telegram APIs. As the forum operators have a strong incentive to keep their users promptly informed, their announcements provide a reliable timeline of incidents and responses.
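A rough sketch of how such a snapshot might be collected with Telethon follows. It is an illustration under assumptions, not the authors' scraper: the record fields, the `message_to_record` and `dump_channel` names, and the `session` default are hypothetical, and `api_id` and `api_hash` are credentials the caller must obtain from Telegram. The Telethon calls used (`TelegramClient`, `iter_messages` with `reverse=True`) and the message attributes read here reflect the library as commonly documented and should be checked against the installed version.

```python
def message_to_record(msg):
    """Flatten a Telethon-style message object into a plain dict.

    Attributes are read defensively because service messages and
    older posts may lack views, replies, or reactions.
    """
    reactions = getattr(msg, "reactions", None)
    results = getattr(reactions, "results", None) or []
    return {
        "id": msg.id,
        "date": msg.date.isoformat() if msg.date else None,
        "sender_id": getattr(msg, "sender_id", None),
        "text": getattr(msg, "message", None),
        "views": getattr(msg, "views", None),
        "reply_to_msg_id": getattr(msg, "reply_to_msg_id", None),
        "reaction_count": sum(r.count for r in results),
    }


async def dump_channel(channel, api_id, api_hash, session="snapshot"):
    """Collect every message of a public channel, oldest first."""
    # Third-party dependency, imported lazily so the helper above
    # can be used and tested without Telethon installed.
    from telethon import TelegramClient

    records = []
    async with TelegramClient(session, api_id, api_hash) as client:
        async for msg in client.iter_messages(channel, reverse=True):
            records.append(message_to_record(msg))
    return records
```

With valid credentials, the coroutine could be run as `asyncio.run(dump_channel("some_public_channel", api_id, api_hash))`, where the channel name is a placeholder.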
3.3. Web Traffic and Search Trends Analytics
We found from announcements in the Telegram group that Kiwi Farms could be accessed through six major domains: the primary one is kiwifarms.net; four alternatives are kiwifarms.ru, kiwifarms.top, kiwifarms.is, and kiwifarms.st; and a Pleroma decentralised web instance is at kiwifarms.cc.[8] To investigate how users navigated across these domains when the forum was disrupted, we analysed traffic analytics for all six domains provided by Similarweb, the leading platform on the market for insights and intelligence into web traffic and performance.[9] Their reports are aggregated from anonymous statistics drawn from many inputs, including their own analytics services, data shared by ISPs and other measurement companies, data crawled from billions of websites, and device traffic data (both website and app) such as plugins, add-ons, and pixel tracking. Their algorithm then extrapolates the aggregated data to the entire Internet space. Their estimates may therefore not be completely precise, but they reliably reflect trends at both the global and country levels. To test this reliability, we deployed our own infrastructure to collect more than 19M ground-truth traffic records over six months, aggregated them into 30-minute sessions, and compared the result with the visits reported by Similarweb. We found that, while Similarweb underestimates traffic volume because of how repeated pageviews are counted, it captures trends with a strong positive linear relationship (Pearson correlation coefficient R = 0.83). Our analysis in the following sections also suggests a high correlation between the traffic data and forum activity.
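The validation step above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each ground-truth record is a hypothetical `(visitor_id, timestamp)` pair, that a visit ends after 30 minutes of inactivity, and that visits are compared per day. The source states only that records were aggregated into 30-minute sessions and compared with reported visits.

```python
from collections import defaultdict
from datetime import timedelta
from math import sqrt

# Assumption: a new visit starts after more than 30 minutes of inactivity.
SESSION_GAP = timedelta(minutes=30)


def count_daily_visits(records):
    """Group (visitor_id, timestamp) pageviews into visits, counted per day."""
    by_visitor = defaultdict(list)
    for visitor, ts in records:
        by_visitor[visitor].append(ts)

    visits = defaultdict(int)
    for stamps in by_visitor.values():
        stamps.sort()
        previous = None
        for ts in stamps:
            if previous is None or ts - previous > SESSION_GAP:
                # A visit is attributed to the day of its first pageview.
                visits[ts.date()] += 1
            previous = ts
    return dict(visits)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

Given daily visit counts from the ground-truth logs and the matching daily figures from the analytics provider, `pearson` measures how closely the two series move together even when their absolute volumes differ, which is the property the comparison relies on.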
As Similarweb does not offer an academic license, we used a free trial account[10] to access longitudinal web traffic and engagement data going back three months. This includes information about total visits, unique visitors, visit duration, pages per visit, bounce rate, and page views. It also provides figures on search activity, marketing data such as visit sources (for example, direct, search, email, social, referral, ads), and non-identifying geographic and demographic perspectives of the audience. These data, which cover both desktop and mobile traffic, provide valuable insights. They span July to December 2022, two months before and four months after the disruption; this timeframe is sufficient because there was no significant industry intervention against the forum in the past (as shown in Figure 1), and the disruption campaign had almost ended after a few months (see §4). In addition, we collected search trends by country and territory over time from Google Trends, covering the entire lifetime of the forum. Both of these datasets are likely to be complete because they are gathered directly from Similarweb and Google.
[4] The XenForo platform:
[5] The Infinity imageboard: https://github.com/ctrllcrllv/infinity/
[6] Google Perspective API: