Projects and Activities
Research Statistician - Master Information, Milan
- Statistical modeling: data preparation, analysis, and model creation using microeconomic and macroeconomic data.
- Time series, forecasting, and geomarketing.
- Risk analysis: development and application of methodologies for assessing environmental risk, covering hazard, vulnerability, and exposure across Italian areas.
- App development: creation of Shiny apps in R for geographical and statistical analysis.
- Data visualization: data acquisition, preparation, transformation, and dashboarding.
- Technologies: R, RStudio, Shiny, Python, SQL Server, PostgreSQL, Spotfire, Power BI.
Data Scientist / Data Analyst / Business Intelligence Developer - Major Bit Consulting, Rome
- IT consultancy: database management and creation, data loading, manipulation and transformation, data quality, data warehouse maintenance, and datamart construction.
- Code analysis: detailed analysis of customer-supplied code to find and fix errors and optimize performance.
- Code development in SAS, Python, and SQL.
- Machine learning: building, testing, and deploying supervised and unsupervised machine-learning models with Python and SAS Viya, including artificial neural networks, classification and regression models, random forests, SVM, k-means, and PCA.
Italian Air Force - REGISCC: Object Detection with YOLOv7
Objective: Create an interface that displays, in real time, the output video with detected and classified objects, per-second object counts, historical plots of detections, and the source location of the input video.
Activities:
- Training of a deep-learning object detection model using YOLOv5 and YOLOv7.
- Solving overfitting and model optimization problems.
- Integration of the back-end with the database.
- Real-time connection between Python and MySQL or Oracle to save inference outputs.
- Connection between the database and the front-end to process KPIs, graphs, and alerts.
- Live-stream integration using Flask and video transmission protocols.
- Saving user-selected frames of interest.
Technologies: Anaconda, Python 3.8, TensorFlow v2, YOLOv4, YOLOv5, YOLOv7, Keras, Darknet, Spyder, Google Colab, Jupyter Notebook, GPU/CUDA, PyTorch, MS Azure, VS Code.
Terna - Maintenance of the National Electric Network
Objective: Estimate the probability of ice forming on a catenary (overhead line) and predict vegetation growth that threatens the line.
Activities:
- Data preprocessing and construction of the modeling data structure.
- Building machine-learning pipelines with neural networks, SVM, random forest, gradient boosting, and logistic regression.
- Applying variable transformations and parameter tuning to select a champion model.
- Preparing reports and dashboards with probabilities and metrics such as accuracy, precision, and recall.
Technologies: Python, SAS Viya, SAS Model Studio, SAS Model Manager, SAS Visual Analytics, MS Azure.
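The reporting metrics and champion selection mentioned above can be illustrated with a minimal sketch. It is not the project's code: the labels, the two candidate models, their predictions, and the choice of recall as the selection criterion are all hypothetical, and plain Python stands in for the SAS Viya and Python tooling listed.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, and recall for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical ground truth: 1 = event occurred, 0 = no event.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical predictions from two candidate models.
candidates = {
    "logistic_regression": [1, 0, 0, 1, 0, 1, 1, 0],
    "random_forest": [1, 0, 1, 1, 0, 1, 1, 0],
}

scores = {name: classification_metrics(y_true, pred)
          for name, pred in candidates.items()}

# Assumed criterion: the champion is the candidate with the highest recall.
champion = max(scores, key=lambda name: scores[name]["recall"])

for name, metrics in scores.items():
    print(name, metrics)
print("champion:", champion)
```

Recall is used here only as an example criterion; a real selection would depend on the relative cost of missed events and false alarms.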
Internal Project - Convolutional Neural Network for Classification and Localization
Objective: Build and train a convolutional neural network to classify and locate objects in images.
Activities:
- Image, label, and coordinate preprocessing.
- Network architecture design, model creation, training, and parameter saving.
- Prediction of object category and drawing of bounding boxes.
- Goodness-of-fit assessment and hyperparameter tuning.
Technologies: Python, TensorFlow, Keras, Scikit-learn, OpenCV, Spyder, Jupyter.
Banking Field - Reporting Flow
Objective: Identify categories of data within encoded text files to generate multiple outputs for Bank of Italy report checks.
Activities:
- Creation of Python functions and classes to capture the data structure.
- Quality control of the component fields of each row.
- Use of PySpark RDDs to process and transform data in a Hadoop environment.
- Generation of XML, CSV, log, and DAT output files.
Technologies: Python, PySpark, Hadoop.
Credit Agricole - Insurance Policies
Objective: Data quality control and lineage creation for input-output data flows.
Activities:
- Maintenance and update of daily, weekly, and monthly data using SAS Base, SAS Macro, SAS Enterprise Guide, SAS Data Integration, and SAS Visual Analytics.
- Reporting for customer portfolio management and analysis.
- SAS code optimization to reduce processing time.
- Code structure and project-flow changes according to customer requests.
- Resolution of daily and monthly data inconsistency issues.
- Analysis, design, and development of new processing flows.
Technologies: SAS Studio, SAS Enterprise Guide, SAS Base, SAS Macro, SQL, SAS Visual Analytics.
Internal Project - Unsupervised Machine Learning for Document Grouping
Objective: Analyze selected document text to create similarity-based document groups.
Activities:
- Building, training, and testing of an unsupervised machine-learning model with Python.
- Construction of a sparse matrix that converts text into numerical values through vectorization.
- Cluster analysis using k-means.
- Model testing by connecting to an email server, reading message content, and applying clustering.
- Creation of one folder per group to store the grouped mail content.
Technologies: Anaconda, Python, Scikit-learn, VS Code, Keras, k-means.
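The vectorize, cluster, and file-by-group steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: it uses only the Python standard library in place of Scikit-learn, the sample texts are invented, the email-server step is omitted, and the helper names (`tfidf_vectors`, `kmeans`, `save_groups`) and the fixed seed documents are hypothetical.

```python
import math
import tempfile
from collections import Counter, defaultdict
from pathlib import Path


def tfidf_vectors(docs):
    """Turn each document into a sparse {term: weight} dict, L2-normalised."""
    tokenised = [doc.lower().split() for doc in docs]
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        vec = {t: c * (math.log((1 + len(docs)) / (1 + doc_freq[t])) + 1.0)
               for t, c in Counter(tokens).items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def dot(a, b):
    """Dot product of two sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def mean_vector(vectors):
    """Component-wise mean of sparse vectors."""
    total = defaultdict(float)
    for vec in vectors:
        for t, w in vec.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}


def kmeans(vectors, seeds, n_iter=10):
    """Lloyd's algorithm; centroids start at the documents indexed by seeds."""
    centroids = [dict(vectors[i]) for i in seeds]
    labels = None
    for _ in range(n_iter):
        new_labels = []
        for x in vectors:
            # Squared Euclidean distance: |x|^2 - 2 x.c + |c|^2
            dists = [dot(x, x) - 2 * dot(x, c) + dot(c, c) for c in centroids]
            new_labels.append(dists.index(min(dists)))
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for k in range(len(centroids)):
            members = [x for x, lab in zip(vectors, labels) if lab == k]
            if members:
                centroids[k] = mean_vector(members)
    return labels


def save_groups(docs, labels, root):
    """Write each document into a folder named after its cluster."""
    for i, (doc, label) in enumerate(zip(docs, labels)):
        folder = Path(root) / f"group_{label}"
        folder.mkdir(parents=True, exist_ok=True)
        (folder / f"doc_{i}.txt").write_text(doc, encoding="utf-8")


# Invented sample texts standing in for message bodies.
docs = [
    "invoice payment due bank transfer",
    "bank invoice overdue payment reminder",
    "team meeting agenda project schedule",
    "project meeting schedule moved team",
]
labels = kmeans(tfidf_vectors(docs), seeds=[0, 2])

with tempfile.TemporaryDirectory() as tmp:
    save_groups(docs, labels, tmp)
    created = sorted(p.name for p in Path(tmp).iterdir())

print(labels, created)
```

Fixed seed documents keep the sketch deterministic; a library implementation would normally choose initial centroids itself, for example with k-means++.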