Local LLM (Ollama, LM Studio, Gemma)
If you're offline, or security policy prevents using cloud LLMs, you can attach a tool-calling client to a model running on Ollama, LM Studio, etc., and expose FindIP as a tool. We recommend 8B+ models with tool-use support, such as Llama 3.1, Qwen 2.5, and Gemma 2.
Prerequisites
A locally running LLM instance, a FindIP API key, and a client that supports tool calling (e.g. FastMCP, or the OpenAI Python SDK with a custom base URL).
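The code samples below read the API key from the `FINDIP_API_KEY` environment variable; set it in your shell first (the value shown is a placeholder, not a real key):

```shell
export FINDIP_API_KEY="your-findip-api-key"  # placeholder: replace with your actual key
```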
Setup steps
Run a model with Ollama
Start the Ollama server, then pull a model that supports tool calling.

```shell
ollama serve
ollama pull llama3.1:8b-instruct-q4_K_M
# or: ollama pull qwen2.5:7b-instruct
```
Define the FindIP tool
Register the tool against Ollama's OpenAI-compatible API (it listens at http://localhost:11434/v1).
```python
import os

import requests
from openai import OpenAI

# Ollama's OpenAI-compatible endpoint; the api_key value is unused but required.
llm = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

tools = [{
    "type": "function",
    "function": {
        "name": "findip_search",
        "description": "Semantic search across patents from KR/US/JP/CN/EP",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "top_k": {"type": "integer", "default": 10},
            },
            "required": ["query"],
        },
    },
}]

def findip_search(query, top_k=10):
    return requests.post(
        "https://api.findip.ai/api/v1/search/semantic",
        headers={"X-API-Key": os.environ["FINDIP_API_KEY"]},
        json={"query": query, "top_k": top_k},
        timeout=30,
    ).json()
```
Tool-call loop
When the model returns tool_calls, execute each call and feed the results back so the model can compose the final answer.

```python
import json

messages = [{"role": "user", "content": "Top 5 solid-electrolyte patents for solid-state batteries"}]
res = llm.chat.completions.create(
    model="llama3.1:8b-instruct-q4_K_M", messages=messages, tools=tools
)

msg = res.choices[0].message
if msg.tool_calls:
    # Append the assistant turn once, then one tool message per call.
    messages.append(msg)
    for tc in msg.tool_calls:
        # Arguments arrive as a JSON string; parse with json.loads, never eval().
        args = json.loads(tc.function.arguments)
        result = findip_search(**args)
        messages.append({"role": "tool", "tool_call_id": tc.id, "content": json.dumps(result)})

final = llm.chat.completions.create(
    model="llama3.1:8b-instruct-q4_K_M", messages=messages, tools=tools
)
print(final.choices[0].message.content)
```
Sample prompt
Prompt
"Pick 5 key Korean / US patents on stability enhancement of perovskite solar cells and summarize them with their main claims."
Troubleshooting
The model never issues a tool call.
Check whether the model supports tool calling at all. Llama 3.1 8B/70B Instruct, Qwen 2.5 7B+, and Gemma 2 27B work reliably. Anything below 7B has a high failure rate.
Responses are too slow.
Having the local model read and summarize the raw response is expensive. Extract only the title and abstract from the FindIP response before passing it back; this cuts token usage to roughly one-fifth.
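A minimal sketch of that trimming step. The `results`, `title`, and `abstract` key names are assumptions about the response shape; check them against the actual FindIP payload.

```python
def trim_results(response, fields=("title", "abstract")):
    """Keep only lightweight fields from each hit before handing the
    payload back to the model. The 'results' key and the field names
    are assumed -- verify them against the real FindIP response."""
    hits = response.get("results", [])
    return [{f: hit.get(f, "") for f in fields} for hit in hits]

# Illustrative response shape (not real API output)
sample = {"results": [
    {"title": "Solid electrolyte", "abstract": "...", "claims": "very long text", "score": 0.93},
]}
trimmed = trim_results(sample)  # only title and abstract survive
```

Pass `json.dumps(trim_results(result))` as the tool message content instead of the full response.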
I want to use the MCP standard directly.
You can connect to https://api.findip.ai/mcp directly via FastMCP or LangChain's MCP adapter. The OAuth flow needs extra work in headless environments though, so the API-key approach is usually simpler.