Energy Engineer (UnB) │ Data Scientist and Analytics (USP)
The programming languages used were PYTHON and VBA EXCEL.
Seeking to simplify and make clear the flow of activities to obtain the final product, this notebook has been divided into 4 parts.
PART 1 - INTRODUCTION
- Containing a short summary of how the project was developed, the basics and bibliography.
PART 2 - EXCEL VBA
- Introducing the
VBA
formulas developed in excel to be called inpython
.
PART 3 - PYTHON
- Presenting the
python
code used for this project.
PART 4 - CONCLUSION
- Final considerations of the project.
- https://www.automateexcel.com/vba-code-examples/
- https://www.rondebruin.nl/index.htm
- https://www.xlwings.org/
- https://pandas.pydata.org/docs/
- http://timgolden.me.uk/pywin32-docs/contents.html
- https://github.com/wesm/pydata-book
- https://github.com/fzumstein/python-for-excel
-
At first, the files are only available in excel
".XLS"
format, under the name"vendas-combustíveis-m3.xls"
. -
Within this initial file, there are two pivot tables that are the target. These are:
-
Pivot Table 1 ) "Vendas, pelas distribuidoras, dos derivados combustíveis de petróleo por Unidade da Federação e produto - 2000-2020 (m3)"
-
Pivot Table 2 ) "Vendas, pelas distribuidoras, de óleo diesel por tipo e Unidade da Federação - 2013-2020 (m3)"
-
-
This data, presented by the pivot tables, does not have its data source easily accessible in another spreadsheet. Also, the data is not available through the Excel shortcut: PivotTableTools>Analyze>Change Data Source. This shows the need to extract them using Excel's own
VBA
programming language. The advantage of extracting them this way is not only the reduced time for processes that could be long, but the possibility of applying them viapython
, through thexlwings library
.-
The worksheet, once opened, has by default only one sheet, called "plan1".
-
The macros created in VBA are available in the folder
"\ANP-PROJECT\Codigos_VBA"
. -
To extract this data, 4 macros were created in VBA. These are presented and described in PART 2 EXCEL
-
-
Once all the VBA - MACROS have been created, they can be called by
python
and applied there viaxlwings library
. -
After applying the Macros on python, the end products of the extraction are two files in
"CSV-UTF8"
: -
These files were managed via the
Pandas Library
from python. Having the descriptive in PART 3 PYTHON -
Finally, the final product of this project is two files in
"CSV-UTF8"
available in the folder"\ANP-PROJECT\Planilhas Finais"
, according to the following table:
Column | Type |
---|---|
year_month | date |
uf | string |
product | string |
unit | string |
volume | double |
created_at | timestamp |