lakeFS

lakeFS provides scalable version control over the data lake, and uses Git-like semantics to create and access those versions.

This notebooks covers how to load document objects from a lakeFS path (whether it's an object or a prefix).

Initializing the lakeFS loader

Replace ENDPOINT, LAKEFS_ACCESS_KEY, and LAKEFS_SECRET_KEY values with your own.

from langchain_community.document_loaders import LakeFSLoader

API Reference:LakeFSLoader

ENDPOINT = ""
LAKEFS_ACCESS_KEY = ""
LAKEFS_SECRET_KEY = ""

lakefs_loader = LakeFSLoader(
    lakefs_access_key=LAKEFS_ACCESS_KEY,
    lakefs_secret_key=LAKEFS_SECRET_KEY,
    lakefs_endpoint=ENDPOINT,
)

Specifying a path

You can specify a prefix or a complete object path to control which files to load.

Specify the repository, reference (branch, commit id, or tag), and path in the corresponding REPO, REF, and PATH to load the documents from:

REPO = ""
REF = ""
PATH = ""

lakefs_loader.set_repo(REPO)
lakefs_loader.set_ref(REF)
lakefs_loader.set_path(PATH)

docs = lakefs_loader.load()
docs

Document loader conceptual guide
Document loader how-to guides

Initializing the lakeFS loader​

Specifying a path​

Related​

Was this page helpful?

Initializing the lakeFS loader

Specifying a path

Related