MLOps Course
This isn't the right course for me, but I've started, so I'm going to keep going for a bit.
Focused on Data Prep and building the prompt.
First 2 videos are very basic definitions and overview.
Gets started in the 3rd video:
- Jupyter Notebook setup for Python
- Covers SQL, BigQuery, and the Stack Overflow public dataset
SQL: use LIMIT to restrict how many rows a query returns.
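A quick sketch of LIMIT for my own reference. This is not the course's BigQuery code: an in-memory SQLite database stands in for the warehouse, and the `posts` table and its columns are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse.
# The table name and columns are hypothetical, not the real dataset schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO posts (id, title) VALUES (?, ?)",
    [(i, f"question {i}") for i in range(1, 101)],
)

# LIMIT caps the number of rows returned.
# ORDER BY makes it predictable which rows come back.
rows = conn.execute(
    "SELECT id, title FROM posts ORDER BY id LIMIT 5"
).fetchall()
print(rows)
conn.close()
```

Only 5 of the 100 rows come back, which is handy when poking at a table for the first time.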
Pandas is a must -- need to refresh.
What data in the warehouse is too large to fit in memory on a laptop?
Data lineage: where did the data come from?
File formats?
- JSONL (JSON Lines). Ideal for small to medium datasets
- TFRecord. For large datasets
- Parquet. Good for large and complex datasets
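To remind myself what JSONL looks like in practice: one JSON object per line. A minimal sketch using only the standard library. The sample records, the file name, and the `read_jsonl` helper are my own assumptions, not from the course.

```python
import json
import os
import tempfile

# Hypothetical sample records, just for illustration.
records = [
    {"id": 1, "title": "How do I sort a list?", "tags": ["python"]},
    {"id": 2, "title": "What is a JOIN?", "tags": ["sql"]},
]

path = os.path.join(tempfile.mkdtemp(), "sample.jsonl")

# Write: serialize each record as one JSON object on its own line.
with open(path, "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")


def read_jsonl(file_path):
    """Yield one parsed record per non-empty line."""
    with open(file_path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


loaded = list(read_jsonl(path))
print(loaded)
```

Because each line stands alone, the file can be read one record at a time instead of loading everything into memory.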
Versioning of datasets is important: a prefix combined with a date-time stamp works well.
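One way the prefix plus timestamp idea could look. The `versioned_name` helper, the timestamp layout, and the `so_questions` prefix are my own choices for this sketch, not something the course prescribes.

```python
from datetime import datetime, timezone


def versioned_name(prefix, when=None, ext="jsonl"):
    """Build a dataset file name like <prefix>_<UTC timestamp>.<ext>.

    Hypothetical helper; the timestamp layout is one possible convention.
    """
    when = when or datetime.now(timezone.utc)
    stamp = when.strftime("%Y%m%dT%H%M%SZ")
    return f"{prefix}_{stamp}.{ext}"


# Fixed timestamp so the result is reproducible.
fixed = datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
name = versioned_name("so_questions", fixed)
print(name)  # so_questions_20240102T030405Z.jsonl
```

With year-first, zero-padded timestamps, sorting the file names alphabetically also sorts them by time.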