.csv file with message text (title, text, type); jpg/png screenshots
Collection metrics: 15,000 messages, 20 days
Languages: English, Spanish, French, German, Polish, Czech
Email spam in European languages
The dataset consists of emails divided into two main classes: "spam" and "not spam". The emails range from 50 to 7,500 characters in length and are written in several languages, in both colloquial and formal styles.
The dataset contains examples of unsolicited messages, including promotional mailings, viral links, microfinance offers, and other fraudulent schemes.
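A minimal sketch of how the .csv part of the dataset might be loaded and sanity-checked. The column names (title, text, type) come from the listing; the label values "spam" and "not spam", the sample rows, and the choice to measure length on the text column are assumptions made for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for the dataset's .csv file. Column names
# follow the listing (title, text, type); the label values are assumed.
SAMPLE_CSV = """title,text,type
Win a prize,Click this link now to claim your free reward before the offer expires today!,spam
Meeting notes,Please find attached the notes from the project meeting held on Tuesday.,not spam
"""

MIN_LEN, MAX_LEN = 50, 7500  # length range stated in the description


def load_messages(handle):
    """Read rows from a CSV file object into a list of dicts."""
    return list(csv.DictReader(handle))


def summarize(rows):
    """Count messages per class and collect rows outside the stated length range."""
    counts = Counter(row["type"] for row in rows)
    # Assumption: the 50 to 7,500 character range applies to the text column.
    out_of_range = [
        row for row in rows if not MIN_LEN <= len(row["text"]) <= MAX_LEN
    ]
    return counts, out_of_range


rows = load_messages(io.StringIO(SAMPLE_CSV))
counts, out_of_range = summarize(rows)
print(counts)
print(len(out_of_range))
```

With a real file, the same function could be called as `load_messages(open(path, newline="", encoding="utf-8"))`, where `path` points at the downloaded .csv file.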
-01-
LLM training: teaching models to recognize different spam formats and to generate, rewrite, or otherwise process spam texts on request
-02-
Spam protection in chat applications: NLP to improve spam filtering in chats, blocking unwanted messages, advertisements, and malicious links and strengthening user security
-03-
Preventing text spam in comments: NLP to detect and block spam in comments, keeping mobile applications safe and comfortable to use
-04-
Phishing protection: classification to recognize phishing emails and keep users from interacting with them
-05-
Marketing campaign optimization: classification to automatically filter out unwanted or fraudulent user requests, improving the quality and accuracy of marketing campaigns
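As a rough illustration of the classification use cases listed above, the sketch below trains a small multinomial Naive Bayes baseline on labeled messages. Naive Bayes is a stand-in chosen here for brevity, not a method prescribed by the dataset; the training sentences, the label values, and the names `tokenize` and `NaiveBayes` are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on runs of word characters (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n / total)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data; real use would take the text and type columns of the .csv.
train = [
    ("Claim your free prize now, click the link", "spam"),
    ("Instant loan approved, click to get cash now", "spam"),
    ("Agenda for the project meeting on Tuesday", "not spam"),
    ("Please review the attached quarterly report", "not spam"),
]
model = NaiveBayes().fit([t for t, _ in train], [y for _, y in train])
print(model.predict("Click now to claim your free cash"))
print(model.predict("Report from the Tuesday project meeting"))
```

A bag-of-words baseline like this ignores word order and language differences, so it is mainly useful as a reference point before trying stronger multilingual models.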