Updated the Windows PC encoding fix with thanks to CG and Jon R
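
For context, a minimal sketch of why an explicit encoding can matter. It assumes the kwargs in the diff below are forwarded to whatever opens each knowledge-base file, which the diff itself does not show. The helper names `read_text` and `read_text_with_fallback` are hypothetical and use only the Python standard library. The fallback loop is a rough stand-in for autodetection, not the loader's actual implementation.

```python
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a text file with an explicit encoding instead of the platform default."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


def read_text_with_fallback(path, encodings=("utf-8", "cp1252")):
    """Try each candidate encoding in order (hypothetical stand-in for autodetection)."""
    last_error = None
    for enc in encodings:
        try:
            return read_text(path, enc)
        except UnicodeDecodeError as err:
            last_error = err
    raise last_error


# Write a small UTF-8 file, as a Markdown knowledge-base document might be saved.
workdir = tempfile.mkdtemp()
utf8_path = os.path.join(workdir, "utf8.md")
with open(utf8_path, "wb") as f:
    f.write("café".encode("utf-8"))

# Decoding with the right encoding round-trips the text.
decoded_ok = read_text(utf8_path)

# Decoding the same bytes as cp1252 (a common Windows default) silently garbles them.
mojibake = read_text(utf8_path, encoding="cp1252")
```

On systems where the default text encoding is not UTF-8, omitting `encoding` can produce garbled text like `mojibake` above or raise `UnicodeDecodeError`, which is presumably the problem this commit addresses.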

Edward Donner
2024-10-29 21:36:29 -04:00
parent 86763f2fcb
commit 284336b3b9
6 changed files with 37 additions and 13 deletions

@@ -87,8 +87,10 @@
"\n",
"folders = glob.glob(\"knowledge-base/*\")\n",
"\n",
-"# With thanks to Jon R, a student on the course, for this fix needed for some users \n",
-"text_loader_kwargs={'autodetect_encoding': True}\n",
+"# With thanks to CG and Jon R, students on the course, for this fix needed for some users \n",
+"text_loader_kwargs = {'encoding': 'utf-8'}\n",
+"# If that doesn't work, some Windows users might need to uncomment the next line instead\n",
+"# text_loader_kwargs={'autodetect_encoding': True}\n",
"\n",
"documents = []\n",
"for folder in folders:\n",
@@ -148,7 +150,9 @@
"\n",
"Another example of an Auto-encoding LLM is BERT from Google. In addition to embedding, Auto-encoding LLMs are often used for classification.\n",
"\n",
-"More details in the resources."
+"### Sidenote\n",
+"\n",
+"In week 8 we will return to RAG and vector embeddings, and we will use an open-source vector encoder so that the data never leaves our computer - that's an important consideration when building enterprise systems where the data needs to remain internal."
]
},
{