Sanoj Raj Shrestha edited this page Jun 25, 2021 · 3 revisions

Major-Project

A central repo for sharing resources, workflows and roadmaps. Learn how to edit a markdown file, and add more resources as you find them. Check the Projects section for project management plans.

Roadmap

  1. Project proposal (In development)
  2. Scrape data from all sites mentioned below. (In development)
  3. Clean the data and remove unnecessary variables and noise. (Pending)
  4. Perform EDA and find the relationships between variables. (Pending)
  5. Build a model to predict/give an objective price of the house based on the parameters. (Pending)
  6. Deploy the model using Flask (API), HTML, CSS, JS and PHP/Laravel (backend). (Pending)
  7. Documentation (Pending)

INFORMATION AND RESOURCES

Tutorial videos

Important programming concepts

  • Creating and importing a Python module (file) and using its functions
  • Namespace of functions in a Python module
  • How functions from other Python modules are called
  • Python data types: lists, tuples, dictionaries and sets
  • for loops, map function
  • Method call from a library vs function call, i.e. how obj.method() works
  • Lambda functions/closures/anonymous functions
  • Python virtual environments
  • Basics of object-oriented Python. Python class constructors. How a method gets access to the object it is called on and how it manipulates that object.
  • What is Anaconda?
  • How pip works
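
A few of the concepts above (constructors, how obj.method() reaches its object, closures, lambdas and map) can be seen together in one small sketch. The House class, the make_discounter helper and all the numbers are made up for illustration and are not part of the project code.

```python
class House:
    """Tiny class used only to illustrate constructors and methods."""

    def __init__(self, price, rooms):
        # The constructor receives the new object as `self` and fills it in.
        self.price = price
        self.rooms = rooms

    def price_per_room(self):
        # `self` is the object the method was called on.
        return self.price / self.rooms


def make_discounter(percent):
    # A closure: the inner function remembers `percent`
    # even after make_discounter has returned.
    def discount(price):
        return price * (100 - percent) / 100
    return discount


house = House(price=12_000_000, rooms=6)

# obj.method() is shorthand for Class.method(obj).
same_result = house.price_per_room() == House.price_per_room(house)

ten_percent_off = make_discounter(10)
prices = [1_000_000, 2_500_000]

# map applies a function to every item; a lambda is an unnamed function.
discounted = list(map(ten_percent_off, prices))
doubled = list(map(lambda p: p * 2, prices))
```

Calling House.price_per_room(house) directly is rarely done in practice; it is shown here only to make visible what Python does behind obj.method().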

Important Statistics Concepts (Add more)

  • Correlation between variables
  • Linear/Multiple/Logistic regression
  • Squared error of regression line
  • p-value, level of significance, null and alternate hypothesis.
  • Cost function and Gradient descent
  • Fit a line using least square method.
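
As a rough illustration of the last few items, the sketch below fits a line in two ways: with the closed-form least squares formulas, and by gradient descent on the mean squared error cost. The function names, the learning rate, the step count and the toy data are all assumptions chosen for the example, not values from the project.

```python
def fit_least_squares(xs, ys):
    """Closed-form slope and intercept that minimise the squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(xs, ys, slope, intercept):
    """Cost function: average squared distance from the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Minimise the same cost by repeatedly stepping against its gradient."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [(slope * x + intercept) - y for x, y in zip(xs, ys)]
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Toy data lying exactly on y = 2x + 1 (made up for the example).
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

ls_slope, ls_intercept = fit_least_squares(xs, ys)
gd_slope, gd_intercept = fit_gradient_descent(xs, ys)
cost = mean_squared_error(xs, ys, ls_slope, ls_intercept)
```

On this toy data both approaches should arrive at essentially the same line. On real, unscaled features (for example prices in the millions) gradient descent usually needs the inputs rescaled or a much smaller learning rate.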

Important Machine learning Concepts

  • One hot encoding
  • Dummy variable trap
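
The two ideas are closely linked, so a small sketch may help. One hot encoding turns a categorical column into one 0/1 column per category. Because those columns always add up to 1, they are perfectly collinear with a regression intercept (the dummy variable trap), and a common remedy is to drop one of them. The one_hot helper below is a hypothetical pure-Python version written for illustration, and the district values are sample inputs only.

```python
def one_hot(values, drop_first=False):
    """Return (categories, rows) with one 0/1 column per category.

    With drop_first=True the first category (in sorted order) becomes the
    baseline and gets no column, which avoids the dummy variable trap.
    """
    categories = sorted(set(values))
    if drop_first:
        categories = categories[1:]
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


districts = ["Kathmandu", "Lalitpur", "Bhaktapur", "Kathmandu"]

full_columns, full_rows = one_hot(districts)
safe_columns, safe_rows = one_hot(districts, drop_first=True)

# In the full encoding every row sums to 1, which is exactly the redundancy
# that causes the trap when the model also has an intercept.
row_sums = [sum(row) for row in full_rows]
```

In the reduced encoding a row of all zeros stands for the dropped baseline category, so no information is lost.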

Algorithms (listed possible best to worst)

  1. Linear regression (related: Wald test)
  2. Convolutional neural networks
  3. Random forest
  4. ID3

Use waka as a testing tool; external libraries may also be used to check the efficiency of the model.

Miscellaneous

Libraries (Must read official docs)

Go through the basic contents of the docs at least once.

Websites to scrape

Always check whether the data is already on Kaggle before writing a script yourself.

  1. 99aana - BS4
  2. Nepal Homes - Might need Selenium
  3. Hamrobazar
  4. Gharbazar - Selenium
  5. Basobaas - Selenium (found on kaggle)
  6. 1Ropani - BS4
  7. Gharghaderi - Selenium
  8. Housing Nepal - BS4
  9. Real Estate In Nepal - BS4
  10. Nepal Home Search - BS4
  11. Nepal Realestates - BS4, very little data
  12. The Realtors - Search with Selenium and scrape with BS4. Scan the title for a house keyword, since listings are not properly categorized.
  13. GharJagga Nepal - BS4
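
For sites where listings are not categorized (such as item 12), the title scan could look roughly like the sketch below. It only covers the filtering step and assumes the titles have already been scraped into a list of strings. The keyword list and the sample titles are made up for illustration and would need tuning against real listings.

```python
import re

# Hypothetical keyword list; extend it after looking at real titles.
HOUSE_KEYWORDS = ("house", "home", "bungalow")


def is_house_listing(title, keywords=HOUSE_KEYWORDS):
    """Return True if any keyword appears as a whole word in the title."""
    words = re.findall(r"[a-z]+", title.lower())
    return any(keyword in words for keyword in keywords)


# Made-up titles standing in for scraped data.
titles = [
    "2.5 Storey House for Sale in Imadol",
    "Land for sale at Budhanilkantha",
    "Commercial Shutter on Rent",
    "Beautiful Bungalow near Ring Road",
]

house_titles = [title for title in titles if is_house_listing(title)]
```

Matching whole words keeps unrelated words such as "household" out, but it also misses plurals like "houses", so the keyword list should include any variants that show up in the data.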

Parameters (subject to change)

  1. Price
  2. Location in District
  3. Number of rooms
  4. Number of floors
  5. Area
  6. Time posted
  7. Road ahead of the house *
  8. Room size *
  9. Bedrooms *
  10. Bathrooms *
  11. Garage *
  12. Car parking *
  13. Kitchen *
  14. Living room *
  15. Furnished? *
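
One possible way to hold a scraped record with these parameters is a dataclass, sketched below. Several things here are assumptions rather than project decisions: the field names, the types, the lack of units, and above all the reading of the * marker as "optional, may be missing on some sites". The sample values are invented.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Listing:
    # Parameters 1-6: treated as always present (assumption).
    price: float
    district: str
    rooms: int
    floors: float
    area: float
    time_posted: str
    # Parameters marked * in the list: assumed optional, so they default to None.
    road_access: Optional[str] = None
    room_size: Optional[str] = None
    bedrooms: Optional[int] = None
    bathrooms: Optional[int] = None
    garage: Optional[bool] = None
    car_parking: Optional[bool] = None
    kitchen: Optional[int] = None
    living_room: Optional[int] = None
    furnished: Optional[bool] = None


def missing_fields(listing):
    """Names of the fields a scraper could not fill in."""
    return [name for name, value in asdict(listing).items() if value is None]


# Invented sample record.
sample = Listing(
    price=25_000_000,
    district="Lalitpur",
    rooms=8,
    floors=2.5,
    area=4.0,
    time_posted="2021-06-25",
    bedrooms=4,
    bathrooms=3,
)

gaps = missing_fields(sample)
```

Keeping missing values as None rather than 0 makes it easier to tell "not stated on the site" apart from "none" during cleaning.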