Spaces:
Running
Running
Abid Ali Awan
commited on
Commit
Β·
dc31452
1
Parent(s):
c75d8bc
Enhance regulatory query handling by implementing improved detection for new regulatory questions versus follow-ups. Update parameter extraction to include report type, refining response generation based on user intent. Revise README to reflect these new features and clarify functionality. Update UIHandler to incorporate conversation context in general chat responses for better user experience.
Browse files- README.md +15 -1
- agents/reg_radar.py +115 -39
- agents/ui_handler.py +28 -3
- app.py +9 -2
README.md
CHANGED
|
@@ -18,13 +18,27 @@ RegRadar is an AI-powered regulatory compliance assistant that monitors global r
|
|
| 18 |
[](https://www.youtube.com/watch?v=v0lZMx_Yt2I)
|
| 19 |
|
| 20 |
## π Features
|
|
|
|
| 21 |
- **Automatic Query Type Detection**: Understands if your message is a regulatory compliance query or a general question, and selects the right tools.
|
| 22 |
-
- **Information Extraction**: Extracts key details (industry, region, keywords) from your queries for precise analysis.
|
|
|
|
| 23 |
- **Regulatory Web Crawler**: Crawls official regulatory websites (e.g., SEC, FDA, FTC, ESMA, BIS) for recent updates and compliance changes (last 30 days).
|
| 24 |
- **Regulatory Search Engine**: Searches across multiple sources for industry-specific compliance information and aggregates results.
|
| 25 |
- **Memory System**: Remembers past queries and responses, personalizing results for each session/user.
|
| 26 |
- **AI Analysis Engine**: Summarizes findings and generates actionable compliance recommendations and executive summaries.
|
| 27 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 28 |
## π Getting Started
|
| 29 |
|
| 30 |
Follow these steps to set up and run RegRadar locally:
|
|
|
|
| 18 |
[](https://www.youtube.com/watch?v=v0lZMx_Yt2I)
|
| 19 |
|
| 20 |
## π Features
|
| 21 |
+
- **Improved Regulatory Query Detection**: Now distinguishes between new regulatory/compliance/update questions and follow-up or general questions. Only new regulatory questions trigger compliance workflows; follow-ups and general queries are handled as general chat.
|
| 22 |
- **Automatic Query Type Detection**: Understands if your message is a regulatory compliance query or a general question, and selects the right tools.
|
| 23 |
+
- **Information Extraction**: Extracts key details (industry, region, keywords, and report type) from your queries for precise analysis.
|
| 24 |
+
- **Smart Report Type Detection**: Automatically determines if you want a quick answer, a summary, or a full compliance report based on your query. The detected report type is shown in the parameter extraction step and controls the style and length of the AI's response.
|
| 25 |
- **Regulatory Web Crawler**: Crawls official regulatory websites (e.g., SEC, FDA, FTC, ESMA, BIS) for recent updates and compliance changes (last 30 days).
|
| 26 |
- **Regulatory Search Engine**: Searches across multiple sources for industry-specific compliance information and aggregates results.
|
| 27 |
- **Memory System**: Remembers past queries and responses, personalizing results for each session/user.
|
| 28 |
- **AI Analysis Engine**: Summarizes findings and generates actionable compliance recommendations and executive summaries.
|
| 29 |
|
| 30 |
+
## π¦ How It Works
|
| 31 |
+
When you submit a query, RegRadar:
|
| 32 |
+
1. Detects if your message is a **new** regulatory/compliance question (not a follow-up or general question).
|
| 33 |
+
2. If yes, extracts industry, region, keywords, and report type.
|
| 34 |
+
3. If no, processes your message as a general or follow-up query.
|
| 35 |
+
4. Runs the appropriate regulatory search/crawl and memory lookup if regulatory.
|
| 36 |
+
5. Shows the extracted parameters, including the report type, in the UI for transparency.
|
| 37 |
+
5. Generates a response matching your intent:
|
| 38 |
+
- **Quick**: Direct, brief answer to specific questions.
|
| 39 |
+
- **Summary**: Short summary for summary requests.
|
| 40 |
+
- **Full**: Comprehensive report (default for vague or broad queries).
|
| 41 |
+
|
| 42 |
## π Getting Started
|
| 43 |
|
| 44 |
Follow these steps to set up and run RegRadar locally:
|
agents/reg_radar.py
CHANGED
|
@@ -29,7 +29,7 @@ class RegRadarAgent:
|
|
| 29 |
return "search", "Regulatory Search"
|
| 30 |
|
| 31 |
def extract_parameters(self, message: str) -> Dict:
|
| 32 |
-
"""Extract industry, region, and
|
| 33 |
# Expanded lists for industries and regions
|
| 34 |
industries = [
|
| 35 |
"fintech",
|
|
@@ -86,34 +86,93 @@ class RegRadarAgent:
|
|
| 86 |
industries_str = ", ".join(industries)
|
| 87 |
regions_str = ", ".join(regions)
|
| 88 |
prompt = f"""
|
| 89 |
-
Extract the following information from the user query below and return ONLY a valid JSON object with keys: industry, region, keywords.
|
| 90 |
- industry: The industry mentioned or implied. Choose from: {industries_str} (or specify if different).
|
| 91 |
- region: The region or country explicitly mentioned. Choose from: {regions_str} (or specify if different).
|
| 92 |
- keywords: The most important regulatory topics or terms, separated by commas. Do NOT include generic words or verbs.
|
|
|
|
| 93 |
|
| 94 |
User query: {message}
|
| 95 |
|
| 96 |
-
Example
|
| 97 |
-
{{"industry": "AI", "region": "EU", "keywords": "AI Act, data privacy"}}
|
|
|
|
|
|
|
| 98 |
"""
|
| 99 |
response = call_llm(prompt)
|
| 100 |
try:
|
| 101 |
params = json.loads(response)
|
| 102 |
except Exception:
|
| 103 |
-
# fallback:
|
| 104 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 105 |
return params
|
| 106 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 107 |
def is_regulatory_query(self, message: str) -> bool:
|
| 108 |
-
"""Detect if this is a regulatory, compliance, or update-related question
|
|
|
|
|
|
|
| 109 |
intent_prompt = f"""
|
| 110 |
-
Is the following user message a regulatory, compliance, or update-related question (
|
|
|
|
| 111 |
Message: {message}
|
| 112 |
Respond with only 'yes' or 'no'.
|
| 113 |
"""
|
| 114 |
|
| 115 |
intent = call_llm(intent_prompt).strip().lower()
|
| 116 |
-
return
|
| 117 |
|
| 118 |
def process_regulatory_query(
|
| 119 |
self, message: str, params: dict = None, user_id: str = "user"
|
|
@@ -139,10 +198,12 @@ class RegRadarAgent:
|
|
| 139 |
"params": params,
|
| 140 |
"crawl_results": crawl_results,
|
| 141 |
"memory_results": memory_results,
|
|
|
|
| 142 |
}
|
| 143 |
|
| 144 |
def generate_report(self, params, crawl_results, memory_results=None):
|
| 145 |
-
"""Generate a
|
|
|
|
| 146 |
memory_context = ""
|
| 147 |
if memory_results:
|
| 148 |
# Format memory results for inclusion in the prompt (limit to 3 for brevity)
|
|
@@ -166,34 +227,49 @@ class RegRadarAgent:
|
|
| 166 |
by_source[source] = []
|
| 167 |
by_source[source].append(result)
|
| 168 |
|
| 169 |
-
|
| 170 |
-
|
| 171 |
-
|
| 172 |
-
|
| 173 |
-
|
| 174 |
-
|
| 175 |
-
|
| 176 |
-
|
| 177 |
-
|
| 178 |
-
|
| 179 |
-
|
| 180 |
-
|
| 181 |
-
|
| 182 |
-
|
| 183 |
-
|
| 184 |
-
|
| 185 |
-
|
| 186 |
-
|
| 187 |
-
|
| 188 |
-
|
| 189 |
-
|
| 190 |
-
|
| 191 |
-
|
| 192 |
-
|
| 193 |
-
|
| 194 |
-
|
| 195 |
-
|
| 196 |
-
|
| 197 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 198 |
|
| 199 |
return stream_llm(summary_prompt)
|
|
|
|
| 29 |
return "search", "Regulatory Search"
|
| 30 |
|
| 31 |
def extract_parameters(self, message: str) -> Dict:
|
| 32 |
+
"""Extract industry, region, keywords, and report_type from the query using LLM (no function calling)."""
|
| 33 |
# Expanded lists for industries and regions
|
| 34 |
industries = [
|
| 35 |
"fintech",
|
|
|
|
| 86 |
industries_str = ", ".join(industries)
|
| 87 |
regions_str = ", ".join(regions)
|
| 88 |
prompt = f"""
|
| 89 |
+
Extract the following information from the user query below and return ONLY a valid JSON object with keys: industry, region, keywords, report_type.
|
| 90 |
- industry: The industry mentioned or implied. Choose from: {industries_str} (or specify if different).
|
| 91 |
- region: The region or country explicitly mentioned. Choose from: {regions_str} (or specify if different).
|
| 92 |
- keywords: The most important regulatory topics or terms, separated by commas. Do NOT include generic words or verbs.
|
| 93 |
+
- report_type: Only set to 'quick' if the user is asking for a highly specific fact, date, number, or detail (e.g., "When did the GDPR take effect?", "What is the fine for X?"). For general regulatory questions (even if phrased as 'what are', 'what is', etc.), or if the user asks for a full report or the question is vague, set to 'full'. Use 'summary' only if the user explicitly asks for a summary.
|
| 94 |
|
| 95 |
User query: {message}
|
| 96 |
|
| 97 |
+
Example outputs:
|
| 98 |
+
{{"industry": "AI", "region": "EU", "keywords": "AI Act, data privacy", "report_type": "summary"}}
|
| 99 |
+
{{"industry": "fintech", "region": "US", "keywords": "SEC regulations", "report_type": "quick"}}
|
| 100 |
+
{{"industry": "healthcare", "region": "Global", "keywords": "HIPAA, patient data", "report_type": "full"}}
|
| 101 |
"""
|
| 102 |
response = call_llm(prompt)
|
| 103 |
try:
|
| 104 |
params = json.loads(response)
|
| 105 |
except Exception:
|
| 106 |
+
# fallback: use heuristics for report_type
|
| 107 |
+
msg_lower = message.lower()
|
| 108 |
+
if any(
|
| 109 |
+
word in msg_lower for word in ["summary", "summarize", "short summary"]
|
| 110 |
+
):
|
| 111 |
+
report_type = "summary"
|
| 112 |
+
elif any(
|
| 113 |
+
word in msg_lower for word in ["report", "full report", "comprehensive"]
|
| 114 |
+
):
|
| 115 |
+
report_type = "full"
|
| 116 |
+
elif any(
|
| 117 |
+
word in msg_lower
|
| 118 |
+
for word in [
|
| 119 |
+
"when is",
|
| 120 |
+
"who is",
|
| 121 |
+
"how much",
|
| 122 |
+
"how many",
|
| 123 |
+
"specific",
|
| 124 |
+
"exact",
|
| 125 |
+
"detail",
|
| 126 |
+
"quick",
|
| 127 |
+
"brief",
|
| 128 |
+
"answer",
|
| 129 |
+
"fact",
|
| 130 |
+
"date",
|
| 131 |
+
"number",
|
| 132 |
+
"tell me more",
|
| 133 |
+
"give me more",
|
| 134 |
+
"more details",
|
| 135 |
+
"more info",
|
| 136 |
+
"expand on",
|
| 137 |
+
"elaborate on",
|
| 138 |
+
]
|
| 139 |
+
):
|
| 140 |
+
report_type = "quick"
|
| 141 |
+
else:
|
| 142 |
+
report_type = "full"
|
| 143 |
+
params = {
|
| 144 |
+
"industry": "General",
|
| 145 |
+
"region": "US",
|
| 146 |
+
"keywords": "",
|
| 147 |
+
"report_type": report_type,
|
| 148 |
+
}
|
| 149 |
+
# Ensure report_type is always present and valid
|
| 150 |
+
if params.get("report_type") not in ["quick", "summary", "full"]:
|
| 151 |
+
params["report_type"] = "full"
|
| 152 |
return params
|
| 153 |
|
| 154 |
+
def format_parameter_extraction(self, params: dict) -> str:
|
| 155 |
+
"""Format the parameter extraction display, including report type."""
|
| 156 |
+
return (
|
| 157 |
+
f"Industry: {params.get('industry', 'N/A')}\n"
|
| 158 |
+
f"Region: {params.get('region', 'N/A')}\n"
|
| 159 |
+
f"Keywords: {params.get('keywords', 'N/A')}\n"
|
| 160 |
+
f"Report Type: {params.get('report_type', 'full').capitalize()}"
|
| 161 |
+
)
|
| 162 |
+
|
| 163 |
def is_regulatory_query(self, message: str) -> bool:
|
| 164 |
+
"""Detect if this is a new regulatory, compliance, or update-related question (not a follow-up or general question).
|
| 165 |
+
Returns True only if the message is a new regulatory/compliance/update question. Returns False for follow-up regulatory or general questions.
|
| 166 |
+
"""
|
| 167 |
intent_prompt = f"""
|
| 168 |
+
Is the following user message a new regulatory, compliance, or update-related question? Respond 'yes' ONLY if the user is asking a new regulatory, compliance, or update-related question, not a follow-up or general question. If the message is a follow-up to a previous regulatory discussion (e.g., 'Can you expand on that?', 'What about healthcare?'), or a general/non-regulatory question, respond 'no'.
|
| 169 |
+
|
| 170 |
Message: {message}
|
| 171 |
Respond with only 'yes' or 'no'.
|
| 172 |
"""
|
| 173 |
|
| 174 |
intent = call_llm(intent_prompt).strip().lower()
|
| 175 |
+
return intent.startswith("y")
|
| 176 |
|
| 177 |
def process_regulatory_query(
|
| 178 |
self, message: str, params: dict = None, user_id: str = "user"
|
|
|
|
| 198 |
"params": params,
|
| 199 |
"crawl_results": crawl_results,
|
| 200 |
"memory_results": memory_results,
|
| 201 |
+
"report_type": params.get("report_type", "full"),
|
| 202 |
}
|
| 203 |
|
| 204 |
def generate_report(self, params, crawl_results, memory_results=None):
|
| 205 |
+
"""Generate a regulatory report (quick, summary, or full) including memory context if available"""
|
| 206 |
+
report_type = params.get("report_type", "full")
|
| 207 |
memory_context = ""
|
| 208 |
if memory_results:
|
| 209 |
# Format memory results for inclusion in the prompt (limit to 3 for brevity)
|
|
|
|
| 227 |
by_source[source] = []
|
| 228 |
by_source[source].append(result)
|
| 229 |
|
| 230 |
+
if report_type == "quick":
|
| 231 |
+
summary_prompt = f"""
|
| 232 |
+
Provide a very brief (1-2 sentences) answer with the most important regulatory update for {params["industry"]} in {params["region"]} (keywords: {params["keywords"]}).
|
| 233 |
+
{memory_context}
|
| 234 |
+
Data:
|
| 235 |
+
{json.dumps(by_source, indent=2)}
|
| 236 |
+
"""
|
| 237 |
+
elif report_type == "summary":
|
| 238 |
+
summary_prompt = f"""
|
| 239 |
+
Provide a concise summary (1 short paragraph) of the most important regulatory updates for {params["industry"]} in {params["region"]} (keywords: {params["keywords"]}).
|
| 240 |
+
{memory_context}
|
| 241 |
+
Data:
|
| 242 |
+
{json.dumps(by_source, indent=2)}
|
| 243 |
+
"""
|
| 244 |
+
else: # full
|
| 245 |
+
summary_prompt = f"""
|
| 246 |
+
Create a comprehensive regulatory compliance report for {params["industry"]} industry in {params["region"]} region.
|
| 247 |
+
{memory_context}
|
| 248 |
+
Analyze these regulatory updates:
|
| 249 |
+
{json.dumps(by_source, indent=2)}
|
| 250 |
+
|
| 251 |
+
Include:
|
| 252 |
+
|
| 253 |
+
---
|
| 254 |
+
|
| 255 |
+
## ποΈ Executive Summary
|
| 256 |
+
(2-3 sentences overview)
|
| 257 |
+
|
| 258 |
+
## π Key Findings
|
| 259 |
+
β’ Finding 1
|
| 260 |
+
β’ Finding 2
|
| 261 |
+
β’ Finding 3
|
| 262 |
+
|
| 263 |
+
## π‘οΈ Compliance Requirements
|
| 264 |
+
- List main requirements with priorities
|
| 265 |
+
|
| 266 |
+
## β
Action Items
|
| 267 |
+
- Specific actions with suggested timelines
|
| 268 |
+
|
| 269 |
+
## π Resources
|
| 270 |
+
- Links and references
|
| 271 |
+
|
| 272 |
+
Use emojis, bullet points, and clear formatting. Keep it professional but readable.
|
| 273 |
+
"""
|
| 274 |
|
| 275 |
return stream_llm(summary_prompt)
|
agents/ui_handler.py
CHANGED
|
@@ -44,7 +44,7 @@ class UIHandler:
|
|
| 44 |
)
|
| 45 |
|
| 46 |
def _handle_general_chat(self, message, history, user_id_state):
|
| 47 |
-
"""Handle general (non-regulatory) chat flow."""
|
| 48 |
history.append(
|
| 49 |
ChatMessage(role="assistant", content="π¬ Processing general query...")
|
| 50 |
)
|
|
@@ -55,7 +55,32 @@ class UIHandler:
|
|
| 55 |
streaming_content = ""
|
| 56 |
history.append(ChatMessage(role="assistant", content=""))
|
| 57 |
|
| 58 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 59 |
streaming_content += chunk
|
| 60 |
history[-1] = ChatMessage(role="assistant", content=streaming_content)
|
| 61 |
yield history, "", gr.update(interactive=False), user_id_state
|
|
@@ -88,7 +113,7 @@ class UIHandler:
|
|
| 88 |
|
| 89 |
# Clear status and show parameter extraction (collapsible)
|
| 90 |
history.pop()
|
| 91 |
-
param_msg =
|
| 92 |
history.append(
|
| 93 |
ChatMessage(
|
| 94 |
role="assistant",
|
|
|
|
| 44 |
)
|
| 45 |
|
| 46 |
def _handle_general_chat(self, message, history, user_id_state):
|
| 47 |
+
"""Handle general (non-regulatory) chat flow with context from conversation history."""
|
| 48 |
history.append(
|
| 49 |
ChatMessage(role="assistant", content="π¬ Processing general query...")
|
| 50 |
)
|
|
|
|
| 55 |
streaming_content = ""
|
| 56 |
history.append(ChatMessage(role="assistant", content=""))
|
| 57 |
|
| 58 |
+
# Gather last 5 user/assistant messages as context
|
| 59 |
+
context_msgs = []
|
| 60 |
+
for msg in history[-10:]:
|
| 61 |
+
if isinstance(msg, dict):
|
| 62 |
+
role = msg.get("role")
|
| 63 |
+
content = msg.get("content")
|
| 64 |
+
else:
|
| 65 |
+
role = getattr(msg, "role", None)
|
| 66 |
+
content = getattr(msg, "content", None)
|
| 67 |
+
if role in ("user", "assistant"):
|
| 68 |
+
context_msgs.append(f"{role.capitalize()}: {content}")
|
| 69 |
+
context_str = "\n".join(context_msgs[-5:])
|
| 70 |
+
|
| 71 |
+
# Compose prompt with context
|
| 72 |
+
if context_str:
|
| 73 |
+
prompt = f"""
|
| 74 |
+
You are an expert assistant. Here is the recent conversation history:
|
| 75 |
+
{context_str}
|
| 76 |
+
|
| 77 |
+
Now answer the user's latest message:
|
| 78 |
+
{message}
|
| 79 |
+
"""
|
| 80 |
+
else:
|
| 81 |
+
prompt = message
|
| 82 |
+
|
| 83 |
+
for chunk in stream_llm(prompt):
|
| 84 |
streaming_content += chunk
|
| 85 |
history[-1] = ChatMessage(role="assistant", content=streaming_content)
|
| 86 |
yield history, "", gr.update(interactive=False), user_id_state
|
|
|
|
| 113 |
|
| 114 |
# Clear status and show parameter extraction (collapsible)
|
| 115 |
history.pop()
|
| 116 |
+
param_msg = self.agent.format_parameter_extraction(params)
|
| 117 |
history.append(
|
| 118 |
ChatMessage(
|
| 119 |
role="assistant",
|
app.py
CHANGED
|
@@ -3,6 +3,10 @@ RegRadar - AI Regulatory Compliance Assistant
|
|
| 3 |
|
| 4 |
This application monitors and analyzes regulatory updates, providing
|
| 5 |
compliance guidance for various industries and regions.
|
|
|
|
|
|
|
|
|
|
|
|
|
| 6 |
"""
|
| 7 |
|
| 8 |
import warnings
|
|
@@ -74,12 +78,15 @@ def create_demo():
|
|
| 74 |
with gr.Accordion("π οΈ Available Tools", open=False):
|
| 75 |
gr.Markdown("""
|
| 76 |
**π§ Query Type Detection**
|
|
|
|
|
|
|
| 77 |
- Automatically detects if your message is a regulatory compliance query or a general question
|
| 78 |
- Selects the appropriate tools and response style based on your intent
|
| 79 |
|
| 80 |
**π© Information Extraction**
|
| 81 |
-
- Extracts key details (industry, region, keywords) from your command
|
| 82 |
-
-
|
|
|
|
| 83 |
|
| 84 |
**π Regulatory Web Crawler**
|
| 85 |
- Crawls official regulatory websites (SEC, FDA, FTC, etc.)
|
|
|
|
| 3 |
|
| 4 |
This application monitors and analyzes regulatory updates, providing
|
| 5 |
compliance guidance for various industries and regions.
|
| 6 |
+
|
| 7 |
+
New Feature: Improved Regulatory Query Detection
|
| 8 |
+
- Only new regulatory/compliance/update questions are treated as regulatory.
|
| 9 |
+
- Follow-up or general questions are handled as general chat, not as regulatory queries.
|
| 10 |
"""
|
| 11 |
|
| 12 |
import warnings
|
|
|
|
| 78 |
with gr.Accordion("π οΈ Available Tools", open=False):
|
| 79 |
gr.Markdown("""
|
| 80 |
**π§ Query Type Detection**
|
| 81 |
+
- Now distinguishes between new regulatory/compliance/update questions and follow-up or general questions.
|
| 82 |
+
- Only new regulatory questions trigger compliance workflows; follow-ups and general queries are handled as general chat.
|
| 83 |
- Automatically detects if your message is a regulatory compliance query or a general question
|
| 84 |
- Selects the appropriate tools and response style based on your intent
|
| 85 |
|
| 86 |
**π© Information Extraction**
|
| 87 |
+
- Extracts key details (industry, region, keywords, and report type) from your command
|
| 88 |
+
- Determines if you want a quick answer, summary, or full report, and adapts the response accordingly
|
| 89 |
+
- Shows the detected report type in the parameter extraction step
|
| 90 |
|
| 91 |
**π Regulatory Web Crawler**
|
| 92 |
- Crawls official regulatory websites (SEC, FDA, FTC, etc.)
|