# Nmap API

Uses Python 3.10, Debian, `python-nmap`, and the Flask framework to create an Nmap API that can run scans quickly online and is easy to deploy. The API also includes GPT-3 functionality for AI-generated reports. This is an implementation for our college PCL project, which is still under development and constantly updating.
## API Reference

### Get all items

```
GET /api/p1/{auth_key}/{target}
GET /api/p2/{auth_key}/{target}
GET /api/p3/{auth_key}/{target}
GET /api/p4/{auth_key}/{target}
GET /api/p5/{auth_key}/{target}
```
| Parameter  | Type     | Description                             |
| ---------- | -------- | --------------------------------------- |
| `auth_key` | `string` | **Required.** The API auth key          |
| `target`   | `string` | **Required.** The target hostname or IP |
### Get item

```
GET /api/p1/
GET /api/p2/
GET /api/p3/
GET /api/p4/
GET /api/p5/
```
| Parameter | Return data | Description           | Nmap Command                                         |
| --------- | ----------- | --------------------- | ---------------------------------------------------- |
| `p1`      | `json`      | Effective Scan        | `-Pn -sV -T4 -O -F`                                  |
| `p2`      | `json`      | Simple Scan           | `-Pn -T4 -A -v`                                      |
| `p3`      | `json`      | Low Power Scan        | `-Pn -sS -sU -T4 -A -v`                              |
| `p4`      | `json`      | Partial Intense Scan  | `-Pn -p- -T4 -A -v`                                  |
| `p5`      | `json`      | Complete Intense Scan | `-Pn -sS -sU -T4 -A -PE -PP -PY -g 53 --script=vuln` |
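A scan can be requested with a plain HTTP GET against one of the profile endpoints above. The sketch below is an illustration only; the base URL, auth key, and target are placeholders for your own deployment's values.

```python
# Minimal sketch of calling the scan endpoints; the base URL, auth key,
# and target below are placeholders, not values from this project.
import json
import urllib.request

API_BASE = "https://localhost:443"  # assumption: the default deploy binds to 443


def build_scan_url(profile: str, auth_key: str, target: str) -> str:
    """Build the URL for one of the p1..p5 scan profiles."""
    return f"{API_BASE}/api/{profile}/{auth_key}/{target}"


def run_scan(profile: str, auth_key: str, target: str) -> dict:
    """Request a scan and parse the JSON report it returns."""
    with urllib.request.urlopen(build_scan_url(profile, auth_key, target)) as resp:
        return json.loads(resp.read().decode())


# e.g. run_scan("p2", "<auth_key>", "scanme.nmap.org") for the Simple Scan
```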
## Auth and User management

### Registration

```
GET /register
```

Payload:

| Parameter      | Type   | Description                         |
| -------------- | ------ | ----------------------------------- |
| `JSON_Payload` | `JSON` | The registration parameters as JSON |

The parameters should look like this while registering a new user:

```json
{
    "user_id": 1,
    "username": "tim",
    "role": "user",
    "priority": "low"
}
```

For a new admin user, the role must be changed to `admin`.

The current default admin user has a key of `60e709884276ce6096d1`.
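As a sketch, registering a user amounts to sending the payload above to `/register`. The helper name and base URL below are ours, not part of the API:

```python
# Hedged sketch of building a /register request with the documented payload.
# build_registration is a hypothetical helper; the base URL is a placeholder.
import json


def build_registration(base_url: str, user_id: int, username: str,
                       role: str = "user", priority: str = "low") -> tuple[str, str]:
    """Return the /register URL and the JSON payload the endpoint expects."""
    payload = {
        "user_id": user_id,
        "username": username,
        "role": role,        # use "admin" for a new admin user
        "priority": priority,
    }
    return f"{base_url}/register", json.dumps(payload)


url, body = build_registration("https://localhost:443", 1, "tim")
```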
### Remove user

```
POST /rmuser/<int:id>/<string:username>/<string:key>
```

| Parameter  | Type     | Description           |
| ---------- | -------- | --------------------- |
| `id`       | `int`    | The user's ID         |
| `username` | `string` | The assigned username |
| `key`      | `string` | The admin key         |

For this function to work, the admin key must be supplied; the request will fail if the key is incorrect.
### Get users

```
GET /getuser/<string:admin_key>
```

| Parameter   | Type     | Description   |
| ----------- | -------- | ------------- |
| `admin_key` | `string` | The admin key |

This is for verifying the users and for implementing the front-end section of the code, which will be added in the future.
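The two admin endpoints can be addressed with simple URL builders. This is a hedged sketch; the helper names are ours, and the base URL and admin key are placeholders:

```python
# Hypothetical URL builders for the admin endpoints documented above.
def rmuser_url(base_url: str, user_id: int, username: str, admin_key: str) -> str:
    """Build the POST /rmuser/<id>/<username>/<key> URL."""
    return f"{base_url}/rmuser/{user_id}/{username}/{admin_key}"


def getuser_url(base_url: str, admin_key: str) -> str:
    """Build the GET /getuser/<admin_key> URL."""
    return f"{base_url}/getuser/{admin_key}"
```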
## Improvements

Added GPT functionality with a chunking module. The methodology is based on how LangChain GPT embeddings operate. Basically, the operation goes like this:

### Loop Method

```
Data -> Chunks_generator ─┐             ┌─> AI_Loop -> Data_Extraction -> Return_Data
   (GPT3 - 1500 TOKENS)   ├─> Chunk1  ──┤
   (GPT4 - 3000 TOKENS)   ├─> Chunk2  ──┤
                          ├─> Chunk3  ──┤
                          └─> Chunk N ──┘
```

### Embeddings Method

```
Data -> Chunks_generator ─┐                           ┌─> Vector Database ─> AI_Loop -> Data_Extraction -> Return_Data
   (GPT3 - 1500 TOKENS)   ├─> Chunk1 ─> Embeddings ───┤
   (GPT4 - 3000 TOKENS)   ├─> Chunk2 ─> Embeddings ───┤
                          ├─> Chunk3 ─> Embeddings ───┤
                          └─> ChunkN ─> Embeddings ───┘
```
This is how it works:

- Step 1:
  - The JSON scan completes (or the text is extracted) and is converted into a string.
- Step 2:
  - The long string is converted into individual tokens of words and characters, for example:
    `[]{};word` == `'[', ']', '{', '}', ';', 'word'`
- Step 3:
  - The long list of tokens is divided into groups of lists according to how many tokens we want. For our use case we have a prompt plus the extracted data, and for simplicity we went with chunks of 500 tokens plus the prompt tokens.
- Step 4:
  - Step 4 can be achieved in 3 ways: a) LangChain, b) the OpenAI functions feature, c) the OpenAI API calls.
  - From our tests, the first option (LangChain LLM) did not work, as it is not built for such processes; the second option (OpenAI functions feature) needed more support and context; the third was the best, as we can provide the rules and output format for it to follow.
- Step 5:
  - The final step is to run the loop, regex the output data, and return it as output.
  - The reason for using regex is that AI output is unpredictable, so we need to take measures to keep our data usable. The prompt doubles as an output format, making sure the AI gives that output no matter what so we can easily regex it.
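Steps 2 and 3 above can be sketched as follows. The naive regex tokenizer here is a rough stand-in, since real token counts come from the model's own tokenizer; the 500-token chunk size mirrors the value mentioned above:

```python
# Rough sketch of the tokenize-then-chunk flow from Steps 2 and 3.
# The tokenizer is a simplification, not the model's real tokenizer.
import re


def tokenize(text: str) -> list[str]:
    """Split a string into word tokens and single non-space characters."""
    return re.findall(r"\w+|\S", text)


def chunk_tokens(tokens: list[str], size: int = 500) -> list[list[str]]:
    """Group the token list into chunks of at most `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]
```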
AI code:

```python
import openai
from typing import Any

model_engine = "text-davinci-003"  # example engine; set to the model you use


def AI(analyze: str) -> dict[str, Any]:
    # Prompt describing what the query is about
    prompt = f"""
        Do a vulnerability analysis report on the following JSON data and
        follow the following rules:
        1) Calculate the criticality score.
        2) Return all the open ports within the open_ports list.
        3) Return all the closed ports within the closed_ports list.
        4) Return all the filtered ports within the filtered_ports list.

        output format: {{
            "open_ports": [],
            "closed_ports": [],
            "filtered_ports": [],
            "criticality_score": ""
        }}

        data = {analyze}
    """
    try:
        # A structure for the request
        completion = openai.Completion.create(
            engine=model_engine,
            prompt=prompt,
            max_tokens=1024,
            n=1,
            stop=None,
        )
        response = completion.choices[0].text
        # Assuming extract_ai_output returns a dictionary
        extracted_data = extract_ai_output(response)
    except KeyboardInterrupt:
        print("Bye")
        quit()
    # Store the outputs in a dictionary
    ai_output = {
        "open_ports": extracted_data.get("open_ports"),
        "closed_ports": extracted_data.get("closed_ports"),
        "filtered_ports": extracted_data.get("filtered_ports"),
        "criticality_score": extracted_data.get("criticality_score")
    }
    return ai_output
```
The Prompt, Regex and extraction:

```python
prompt = f"""
    Do a vulnerability analysis report on the following JSON data provided.
    It's the data extracted from my network scanner.
    Follow the following rules for analysis:
    1) Calculate the criticality score based on the service or CVE.
    2) Return all the open ports within the open_ports list.
    3) Return all the closed ports within the closed_ports list.
    4) Return all the filtered ports within the filtered_ports list.
    5) Keep the highest possible accuracy.
    6) Do not provide unwanted explanations.
    7) Only provide details in the output_format provided.

    output_format: {{
        "open_ports": [],
        "closed_ports": [],
        "filtered_ports": [],
        "criticality_score": ""
    }}

    data = {analyze}
"""
```
The above prompt, with its distinct output format, will return that output no matter the instance. The following things need to be addressed:

- The prompt must be detailed.
- The prompt must cover all sorts of use cases and inputs.
- The prompt must be guided with rules to follow.
- The number of tokens must be monitored and kept within the model's limit.
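The last point can be sketched with a simple budget check. The characters-per-token heuristic below is only an estimate (real counts come from the model's tokenizer), and the 1500-token limit mirrors the GPT-3 chunk figure mentioned earlier:

```python
# Rough token-budget guard; the chars-per-token ratio is a heuristic,
# not the model's real tokenizer.
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Approximate a token count from the character length."""
    return max(1, round(len(text) / chars_per_token))


def fits_budget(prompt: str, data: str, limit: int = 1500) -> bool:
    """Check whether prompt + data stay under the model's token limit."""
    return estimate_tokens(prompt) + estimate_tokens(data) <= limit
```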
This is the regex for it:

```python
import re
from typing import Any


def extract_ai_output(ai_output: str) -> dict[str, Any]:
    result = {
        "open_ports": [],
        "closed_ports": [],
        "filtered_ports": [],
        "criticality_score": ""
    }
    # Match and extract the port lists
    open_ports_match = re.search(r'"open_ports": \[([^\]]*)\]', ai_output)
    closed_ports_match = re.search(r'"closed_ports": \[([^\]]*)\]', ai_output)
    filtered_ports_match = re.search(
        r'"filtered_ports": \[([^\]]*)\]', ai_output)

    # If found, convert the string of ports to a list of ints
    if open_ports_match:
        result["open_ports"] = list(
            map(int, open_ports_match.group(1).split(',')))
    if closed_ports_match:
        result["closed_ports"] = list(
            map(int, closed_ports_match.group(1).split(',')))
    if filtered_ports_match:
        result["filtered_ports"] = list(
            map(int, filtered_ports_match.group(1).split(',')))

    # Match and extract the criticality score
    criticality_score_match = re.search(
        r'"criticality_score": "([^"]*)"', ai_output)
    if criticality_score_match:
        result["criticality_score"] = criticality_score_match.group(1)
    return result
```
The regex makes sure all the data is extracted and returned with the proper types we want. This also helps with data management and the removal of unwanted information.

The API key must be set:

```python
openai.api_key = '__API__KEY__'
```
## Package

The package is a simple extension for future usage or upgrades. It can be installed by running:

```shell
cd package && pip install .
```

The usage can be implemented like this:

```python
from nmap_api import app

app.openai.api_key = '__API__KEY__'
app.start_api()
```
## Deploy

Once the API is updated, there are two ways to deploy the code:

### Method 1: Docker Instance

The Docker instance can be built using the provided Dockerfile:

```shell
docker build -t <name> .
```

To run the Docker instance:

```shell
docker run -p 443:443 <name>
```

It's as simple as that; no complications involved.
### Method 2: Server Deploy

For server deployment, first download the repo to the server and do the following:

- Step 1: Edit the Nmap service file
  - You can change the `WorkingDirectory` and `gunicorn` paths to the paths you have set.
  - I suggest leaving the rest as it is to avoid unwanted errors.

```ini
[Unit]
Description=Nmap API deployment
After=network.target

[Service]
User=root
WorkingDirectory=/
ExecStart=/usr/local/bin/gunicorn -w 4 -b 0.0.0.0:443 --timeout 2400 --max-requests 0 wsgi:app
Restart=always

[Install]
WantedBy=multi-user.target
```
- Step 2: Start the services

```shell
mv nmapapi.service /etc/systemd/system
sudo systemctl daemon-reload
sudo systemctl start nmapapi
sudo systemctl enable nmapapi
```

- Step 3: The final step varies per setup; it is suggested to set up firewall rules and redirect port 80 to 443.
## Default User Keys

Default_Admin_Key: `60e709884276ce6096d1`