The content of this repository comprises work developed within the scope of the RESCUE (RESilient Cloud for EUropE) project. The objective is to develop reusable, modular components that strengthen the reliability and recovery capabilities of (critical) digital services, and to pilot Cyber Resilient Digital Twins for data centers and edges that use open cloud infrastructure and are capable of hosting mission-critical applications at large scale.
This project implements an advanced automated system for fraud investigation and reporting using Large Language Models (LLMs) and machine learning techniques. The system is designed to process incidents, analyze logs, detect anomalies, generate comprehensive reports, and continuously improve its performance.
Below is a high-level overview of the Fraud Investigation System architecture:
```
+-------------------+       +------------------------+
|  Incident Input   |------>| Incident Understanding |
+-------------------+       +------------------------+
                                        |
                                        v
+-------------------+       +------------------------+
|  Knowledge Base   |<----->|     RAG Processing     |
+-------------------+       +------------------------+
                                        |
                                        v
+-------------------+       +------------------------+
|   Log Retrieval   |<----->|   Anomaly Detection    |
+-------------------+       +------------------------+
                                        |
                                        v
+-------------------+       +------------------------+
| Report Generation |<------|    Output Interface    |
+-------------------+       +------------------------+
          |
          v
+-------------------+
|   Feedback Loop   |
+-------------------+
```
- Automated incident understanding using LLMs
- Dynamic API call generation for log retrieval
- Asynchronous log retrieval from multiple sources (Elasticsearch, Splunk)
- LLM-powered anomaly detection with statistical analysis
- Automated report generation with visualizations
- Flexible output interface (email, file, API)
- Rate limiting and input validation
- Caching and performance optimization
- Robust error handling and retrying mechanisms
- Web interface for monitoring and manual intervention
- Feedback loop for continuous improvement
- Plugin system for easy extension of functionality
- Performance dashboard for visualizing system performance and trends
- Export of investigation results in various formats (JSON, CSV, XML, Excel)
- External API for submitting incidents and retrieving results
The system consists of the following main components:
- Incident Input Interface
- Incident Understanding Module
- API Call Generator
- Log Retrieval Engine
- Anomaly Detection Module
- Report Generation Module
- Output Interface
- Plugin System
- Feedback Loop
- Performance Dashboard
- Result Exporter
- External API
These components are supported by utility modules for LLM integration, error handling, caching, performance optimization, input validation, and rate limiting.
1. Clone the repository:

   ```bash
   git clone https://github.com/AmadeusITGroup/afir.git
   cd afir
   ```

2. Create a virtual environment and activate it:

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Set up environment variables for API keys and sensitive information:

   ```bash
   export OPENAI_API_KEY=your_openai_api_key
   export ANTHROPIC_API_KEY=your_anthropic_api_key
   export HF_API_KEY=your_huggingface_api_key
   export ES_USERNAME=your_elasticsearch_username
   export ES_PASSWORD=your_elasticsearch_password
   export SPLUNK_TOKEN=your_splunk_token
   export SMTP_USERNAME=your_smtp_username
   export SMTP_PASSWORD=your_smtp_password
   ```

5. Configure the system by editing the YAML files in the `config/` directory:

   - `main_config.yaml`: Main system configuration
   - `llm_config.yaml`: LLM provider configuration

6. Set up a Redis server for caching (optional):

   ```bash
   sudo apt-get install redis-server
   sudo systemctl start redis-server
   ```
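As an illustration of the configuration step, a minimal `llm_config.yaml` might look like the following. The keys shown here are assumptions for the sake of the example, not the documented schema — check the shipped files in `config/` for the real structure:

```yaml
# Hypothetical structure -- verify against the actual config/llm_config.yaml
providers:
  openai:
    model: gpt-4o
    api_key_env: OPENAI_API_KEY
    temperature: 0.2
  anthropic:
    model: claude-3-5-sonnet-latest
    api_key_env: ANTHROPIC_API_KEY
default_provider: openai
```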
1. Start the main system:

   ```bash
   python src/main.py
   ```

2. Run the web interface:

   ```bash
   python src/web_interface.py
   ```

3. Start the performance dashboard:

   ```bash
   python src/dashboard.py
   ```

4. Run the external API:

   ```bash
   python src/api.py
   ```
1. Create a new Python file in the `plugins/` directory.
2. Implement your plugin logic and a `register_plugin()` function.
3. The plugin will be automatically loaded by the `PluginManager`.
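A minimal plugin could look like the sketch below. The return value of `register_plugin()` — a small object exposing a `name` and a `run()` hook — is an assumption for illustration; check the `PluginManager` implementation for the actual contract it expects:

```python
# plugins/example_plugin.py -- hypothetical plugin sketch; the exact
# interface expected by the PluginManager may differ.

class ExamplePlugin:
    """Tags incidents whose description mentions a chargeback."""

    name = "example_plugin"

    def run(self, incident: dict) -> dict:
        text = incident.get("description", "").lower()
        if "chargeback" in text:
            incident.setdefault("tags", []).append("chargeback")
        return incident


def register_plugin():
    # Called when the PluginManager scans the plugins/ directory.
    return ExamplePlugin()
```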
Modify the `process_feedback()` method in `src/feedback_loop.py` to implement custom logic for applying insights and improving the system.
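For instance, a custom `process_feedback()` could aggregate analyst ratings and flag low-scoring incident types as candidates for prompt or model tuning. The field names used here (`incident_type`, `rating`) are assumptions, not the actual feedback schema — align them with the real records handled in `src/feedback_loop.py`:

```python
from collections import defaultdict

# Hypothetical sketch of a custom process_feedback(); the feedback
# record fields are assumed, not taken from the real implementation.
def process_feedback(feedback_items, threshold=3.0):
    """Return incident types whose average analyst rating falls below
    the threshold, as candidates for prompt or model tuning."""
    ratings_by_type = defaultdict(list)
    for item in feedback_items:
        ratings_by_type[item["incident_type"]].append(item["rating"])
    return sorted(
        itype for itype, ratings in ratings_by_type.items()
        if sum(ratings) / len(ratings) < threshold
    )
```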
Extend the `ResultExporter` class in `src/export_results.py` to add new export formats.
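As a sketch, adding a Markdown format could look like the following. The base class shown here is a stand-in with an assumed `export()` signature — subclass the real `ResultExporter` from `src/export_results.py` and mirror whatever methods it actually defines:

```python
# Hypothetical extension of ResultExporter with a Markdown format.
# The base class below is a placeholder for the real one.

class ResultExporter:
    def export(self, result: dict, fmt: str) -> str:
        raise ValueError(f"Unsupported format: {fmt}")


class MarkdownResultExporter(ResultExporter):
    def export(self, result: dict, fmt: str) -> str:
        if fmt != "markdown":
            return super().export(result, fmt)
        # Render the investigation result as a simple bullet list.
        lines = [f"# Incident {result.get('incident_id', 'unknown')}", ""]
        lines += [f"- **{key}**: {value}" for key, value in result.items()]
        return "\n".join(lines)
```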
- Endpoint: `/api/submit_incident`
- Method: POST
- Payload: JSON object with incident details
- Response: Confirmation message with incident ID

- Endpoint: `/api/get_result/<incident_id>`
- Method: GET
- Response: Investigation result for the specified incident

- Endpoint: `/api/system_status`
- Method: GET
- Response: Current status of the fraud investigation system
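The endpoints above can be exercised with a small client. The host, port, incident fields, and response key (`incident_id`) are assumptions for illustration — confirm them against `src/api.py`:

```python
import json

# Hypothetical payload -- the incident schema is an assumption, not
# the documented one; check src/api.py for the required fields.
incident = {
    "title": "Suspicious refund burst",
    "description": "Dozens of refunds issued to one account in an hour",
    "severity": "high",
}
body = json.dumps(incident)

# With the API running (e.g. locally), the flow would be roughly:
#   import requests
#   base = "http://localhost:5000"   # assumed host/port
#   resp = requests.post(f"{base}/api/submit_incident", data=body,
#                        headers={"Content-Type": "application/json"})
#   incident_id = resp.json()["incident_id"]
#   result = requests.get(f"{base}/api/get_result/{incident_id}").json()
#   status = requests.get(f"{base}/api/system_status").json()
print(body)
```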
Run the unit tests using:

```bash
python -m unittest discover tests
```
The system uses Python's built-in `logging` module. Log files are stored in the `logs/` directory. You can adjust the logging level and output format in the `logging_config.yaml` file.
Contributions are welcome! Please feel free to submit a Pull Request.