File Upload Support¶
Hayhooks provides built-in support for handling file uploads in your pipelines, making it easy to create applications that process documents, images, and other files.
Overview¶
File upload support enables you to:
- Accept file uploads through REST APIs
- Process multiple files in a single request
- Combine file uploads with other parameters
- Build document processing and RAG pipelines
Basic Implementation¶
To accept file uploads in your pipeline, add a files
parameter to your run_api
method:
from fastapi import UploadFile
from typing import Optional, List
class PipelineWrapper(BasePipelineWrapper):
def run_api(self, files: Optional[List[UploadFile]] = None, query: str = "") -> str:
if not files:
return "No files provided"
# Process files here...
return f"Processed {len(files)} files"
For a complete implementation, see the RAG System Example.
API Usage¶
Multipart Form Data Requests¶
File uploads use multipart/form-data
format:
# Upload single file
curl -X POST \
http://localhost:1416/my_pipeline/run \
-F 'files=@document.pdf' \
-F 'query="Summarize this document"'
# Upload multiple files
curl -X POST \
http://localhost:1416/my_pipeline/run \
-F 'files=@document1.pdf' \
-F 'files=@document2.txt' \
-F 'query="Compare these documents"'
# Upload with additional parameters
curl -X POST \
http://localhost:1416/my_pipeline/run \
-F 'files=@document.pdf' \
-F 'query="Analyze this document"' \
-F 'temperature=0.7'
Python Client Example¶
import requests
# Upload files
files = [
('files', open('document.pdf', 'rb')),
('files', open('notes.txt', 'rb'))
]
data = {
'query': 'Analyze these documents',
'temperature': '0.7'
}
response = requests.post(
'http://localhost:1416/my_pipeline/run',
files=files,
data=data
)
print(response.json())
CLI Usage¶
Hayhooks CLI supports file uploads:
# Upload single file
hayhooks pipeline run my_pipeline --file document.pdf --param 'query="Summarize this"'
# Upload directory
hayhooks pipeline run my_pipeline --dir ./documents --param 'query="Analyze all documents"'
# Upload multiple files
hayhooks pipeline run my_pipeline --file doc1.pdf --file doc2.txt --param 'query="Compare documents"'
# Upload with parameters
hayhooks pipeline run my_pipeline --file document.pdf --param 'query="Analyze"' --param 'temperature=0.7'
Combining Files with Other Parameters¶
You can handle both files and parameters in the same request by adding them as arguments to the run_api
method:
from fastapi import UploadFile
from typing import Optional, List
class PipelineWrapper(BasePipelineWrapper):
def run_api(
self,
files: Optional[List[UploadFile]] = None,
query: str = "",
additional_param: str = "default"
) -> str:
if files and len(files) > 0:
filenames = [f.filename for f in files if f.filename is not None]
return f"Received files: {', '.join(filenames)} with query: {query}"
return "No files received"
Complete Example: RAG System with File Upload¶
For a complete, production-ready example of a RAG system with file uploads, including document indexing and querying with Elasticsearch, see:
- RAG System Example - Full RAG implementation guide
- examples/rag_indexing_query - Complete working code with:
- Document indexing pipeline with file upload support
- Query pipeline for retrieving and generating answers
- Elasticsearch integration
- Support for PDF, Markdown, and text files
Next Steps¶
- PipelineWrapper - Learn about wrapper implementation
- Examples - See working examples
- CLI Commands - CLI usage for file uploads