Skip to content

Latest commit

 

History

History
86 lines (53 loc) · 4.19 KB

README.md

File metadata and controls

86 lines (53 loc) · 4.19 KB

ANP-PROJECT - [EN 🇬🇧]


Energy Engineer (UnB)Data Scientist and Analytics (USP)



PART 1 - INTRODUCTION

The programming languages used were PYTHON and VBA EXCEL.

Seeking to simplify and make clear the flow of activities to obtain the final product, this notebook has been divided into 4 parts.

PART 1 - INTRODUCTION

  • Containing a short summary of how the project was developed, the basics and bibliography.

PART 2 - EXCEL VBA

  • Introducing the VBA formulas developed in excel to be called in python.

PART 3 - PYTHON

  • Presenting the python code used for this project.

PART 4 - CONCLUSION

  • Final considerations of the project.

1.1 AUXILIARY BIBLIOGRAPHY

1.2 PROJECT FLOW

  1. At first, the files are only available in excel ".XLS" format, under the name "vendas-combustíveis-m3.xls".

  2. Within this initial file, there are two pivot tables that are the target. These are:

    • Pivot Table 1 ) "Vendas, pelas distribuidoras, dos derivados combustíveis de petróleo por Unidade da Federação e produto - 2000-2020 (m3)"

    • Pivot Table 2 ) "Vendas, pelas distribuidoras, de óleo diesel por tipo e Unidade da Federação - 2013-2020 (m3)"

  3. This data, presented by the pivot tables, does not have its data source easily accessible in another spreadsheet. Also, the data is not available through the Excel shortcut: PivotTableTools>Analyze>Change Data Source. This shows the need to extract them using Excel's own VBA programming language. The advantage of extracting them this way is not only the reduced time for processes that could be long, but the possibility of applying them via python, through the xlwings library.

    • The worksheet, once opened, has by default only one sheet, called "plan1".

    • The macros created in VBA are available in the folder "\ANP-PROJECT\Codigos_VBA".

    • To extract this data, 4 macros were created in VBA. These are presented and described in PART 2 EXCEL

  4. Once all the VBA - MACROS have been created, they can be called by python and applied there via xlwings library.

  5. After applying the Macros on python, the end products of the extraction are two files in "CSV-UTF8":

  6. These files were managed via the Pandas Library from python. Having the descriptive in PART 3 PYTHON

  7. Finally, the final product of this project is two files in "CSV-UTF8" available in the folder "\ANP-PROJECT\Planilhas Finais", according to the following table:

Column Type
year_month date
uf string
product string
unit string
volume double
created_at timestamp