
How to Audit Ethereum Wallet Activity using Web3.py

Updated on
Dec 27, 2024

15 min read

Overview​

Auditing firms frequently contact our Enterprise & Professional Services team seeking assistance retrieving blockchain activity linked to their clients' wallet addresses. This is usually needed for regular filings during tax season, or it is triggered by a notice of action or fine that their clients have received from regulatory agencies. Accessing this blockchain activity has traditionally been a challenging task, but we're here to simplify it in this step-by-step guide. We'll walk you through how to conduct a thorough audit on any wallet address on Ethereum using Python.

If you have any questions about the steps in this guide or are an auditing firm looking for assistance in retrieving your clients' blockchain activity, QuickNode is happy to help! Reach out to victor@quicknode.com, and we would love to talk to you.

Now, let's get to auditing!

What You Will Do​

In this guide, you will fetch all transaction activity associated with a wallet, including:


  • Transaction history
  • Token transfer history (ERC20)
  • Internal transaction history

What You Will Need​

Before you begin, make sure you have the following:


Dependency    Version
web3.py       5.30.0

Configuring Your Script​

Before writing the functions that will perform the audit, we need to import the required Python packages. Both web3 and tqdm are third-party packages, so make sure they are installed in your environment (for example, with pip). Then create a Python file and include the following code at the top of your file:

from web3 import Web3 # we will be using Web3.py library for this guide
import json # we will need this to parse through your blockchain node's responses
from tqdm import tqdm # this library helps us track the progress of our script

In order to query data from the blockchain, you'll need API endpoints to communicate with the Ethereum network. For this, make sure you create a free QuickNode account here, and once logged in, click the Create an endpoint button, then select the Ethereum chain and Mainnet network.

After creating your endpoint, copy the HTTP Provider link and add it to your script:

# Configuring Ethereum endpoint
w3 = Web3(Web3.HTTPProvider("https://{your-endpoint-name}.quiknode.pro/{your-token}/"))
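
Hard-coding the endpoint URL puts your token directly in the script. As a minimal sketch, and not part of the original script, the URL could instead be read from an environment variable. The variable name QUICKNODE_HTTP_URL and the helper load_endpoint_url are hypothetical names chosen for illustration.

```python
import os

# Hypothetical environment variable name; choose whatever fits your setup
ENDPOINT_ENV_VAR = "QUICKNODE_HTTP_URL"


def load_endpoint_url(env=None):
    """Return the HTTP provider URL from the environment, or raise if it is missing or not HTTP(S)."""
    env = os.environ if env is None else env
    url = env.get(ENDPOINT_ENV_VAR, "").strip()
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"Set {ENDPOINT_ENV_VAR} to your endpoint's HTTP Provider link")
    return url


# Usage with the script above (assumption: w3 is created from this URL):
# w3 = Web3(Web3.HTTPProvider(load_endpoint_url()))
```

Passing a plain dictionary as env makes the helper easy to exercise without touching the real environment.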

Fetch Wallet Transactions​

To identify wallet transaction activity, we will iterate over each block in the range of interest, then over each transaction in that block, and check whether the address of interest appears in either the from or to field of the transaction. The main function below does just that:

# Main function to fetch wallet activity across a range of blocks
def get_transactions_for_addresses(addresses, from_block, to_block):
    transactions = []

    # Calculate the total number of blocks to process
    total_blocks = to_block - from_block + 1

    with tqdm(total=total_blocks, desc="Processing Blocks") as pbar:
        for block_num in range(from_block, to_block + 1):
            # Request block data, including full transaction objects
            block = w3.eth.get_block(block_num, full_transactions=True)

            # Identify block transactions where address of interest is found
            # (tx["to"] is None for contract-creation transactions)
            for tx in block.transactions:
                if tx["from"] in addresses or tx["to"] in addresses:
                    tx_details = {
                        "block": block_num,
                        "hash": tx.hash.hex(),
                        "from": tx["from"],
                        "to": tx["to"],
                        "value": tx["value"],
                        "gas": tx["gas"],
                        "gasPrice": tx["gasPrice"],
                        "input": tx["input"],
                    }

                    transactions.append(tx_details)

            # Update the progress bar
            pbar.update(1)

    return transactions
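
The membership test above compares address strings exactly, so an address supplied in lowercase would not match the mixed-case (checksummed) addresses that web3.py typically returns. The sketch below shows one way to make the comparison case-insensitive. The helpers normalize_addresses and involves_address are hypothetical names, and the transactions are plain dictionaries standing in for the objects returned by the node.

```python
def normalize_addresses(addresses):
    """Lowercase every address so comparisons ignore checksum casing."""
    return {address.lower() for address in addresses}


def involves_address(tx, watched):
    """Return True if the transaction's sender or recipient is in the watched set."""
    sender = (tx.get("from") or "").lower()
    recipient = (tx.get("to") or "").lower()  # "to" is None for contract creations
    return sender in watched or recipient in watched


# Illustrative data only: the wallet from this guide plus a made-up counterparty
watched = normalize_addresses(["0x91b51c173a4bDAa1A60e234fC3f705A16D228740"])
contract_creation = {"from": "0x91b51c173a4bdaa1a60e234fc3f705a16d228740", "to": None}
unrelated = {
    "from": "0x0000000000000000000000000000000000000001",
    "to": "0x0000000000000000000000000000000000000002",
}
matches = [involves_address(tx, watched) for tx in (contract_creation, unrelated)]
```

Using a set for the watched addresses also keeps each lookup constant-time when auditing many wallets at once.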

Fetch Token Transfers​

Notice how that main function fails to do two things: capture token transfers and identify internal transactions. Let's go ahead and create a get_token_transfers function that will capture ERC20 token transfers. To achieve that, we need to parse through the logs of a specific block and inspect the topics that match our criteria. More specifically, we will look for logs carrying the ERC20 Transfer event signature where our wallet address was either the sender or the receiver of the transfer.

# ERC20 Transfer event signature: keccak256("Transfer(address,address,uint256)")
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"

# Helper to convert a raw log entry into a JSON-serializable transfer record
def format_transfer(entry):
    data = entry["data"]
    return {
        "contractAddress": entry["address"],
        # Indexed addresses are left-padded to 32 bytes, so keep only the last 40 hex characters.
        # Stripping leading zeros instead would corrupt addresses that begin with a zero byte.
        "from": w3.toChecksumAddress("0x" + entry["topics"][1].hex()[-40:]),
        "to": w3.toChecksumAddress("0x" + entry["topics"][2].hex()[-40:]),
        # ERC721 Transfer events share this signature but carry no data payload
        "value": int(data, 16) if data not in ("", "0x") else None,
        "topics": [topic.hex() if isinstance(topic, bytes) else topic for topic in entry["topics"]],
        "data": data,
        "blockNumber": entry["blockNumber"],
        "logIndex": entry["logIndex"],
        "transactionIndex": entry["transactionIndex"],
        "transactionHash": entry["transactionHash"].hex(),
        "blockHash": entry["blockHash"].hex(),
        "removed": entry["removed"]
    }

# Function to fetch wallet token transfers
def get_token_transfers(address, block_num):
    # Convert the address to its 32 bytes representation
    padded_address = address.lower().replace("0x", "0x" + "0" * 24)

    # Convert block_num to hexadecimal string
    block_hex = hex(block_num)

    # Look for the address as the sender (topic 1)
    sent_transfers = w3.eth.get_logs({
        "fromBlock": block_hex,
        "toBlock": block_hex,
        "topics": [TRANSFER_TOPIC, padded_address, None],
    })

    # Look for the address as the receiver (topic 2)
    received_transfers = w3.eth.get_logs({
        "fromBlock": block_hex,
        "toBlock": block_hex,
        "topics": [TRANSFER_TOPIC, None, padded_address],
    })

    # Parse through API responses and record transfer details
    return [format_transfer(entry) for entry in sent_transfers + received_transfers]
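
The topic handling can be illustrated without a node. Indexed address arguments are stored in a log as 32-byte topics, so a 20-byte address is left-padded with 12 zero bytes (24 hex characters). The sketch below, using the hypothetical helpers address_to_topic and topic_to_address, converts in both directions with plain string operations. It slices the last 40 hex characters rather than stripping leading zeros, because an address can itself begin with zero bytes.

```python
def address_to_topic(address):
    """Left-pad a 20-byte hex address to the 32-byte form used in indexed log topics."""
    hex_part = address.lower()
    if hex_part.startswith("0x"):
        hex_part = hex_part[2:]
    if len(hex_part) != 40:
        raise ValueError("expected a 20-byte hex address")
    return "0x" + hex_part.rjust(64, "0")


def topic_to_address(topic):
    """Recover the 20-byte address from a 32-byte topic by keeping its last 40 hex characters."""
    return "0x" + topic[-40:]


# Illustrative address that starts with zero bytes
example_address = "0x00000000219ab540356cBB839Cbe05303d7705Fa"
example_topic = address_to_topic(example_address)
round_trip = topic_to_address(example_topic)
```

The recovered address is lowercase; converting it to checksum format is left to web3.py.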

Now, all we need to do is update our get_transactions_for_addresses function to call the get_token_transfers function:

# Main function to fetch wallet activity across a range of blocks
def get_transactions_for_addresses(addresses, from_block, to_block):
    transactions = []

    # Calculate the total number of blocks to process
    total_blocks = to_block - from_block + 1

    with tqdm(total=total_blocks, desc="Processing Blocks") as pbar:
        for block_num in range(from_block, to_block + 1):
            # Request block data, including full transaction objects
            block = w3.eth.get_block(block_num, full_transactions=True)

            # Identify block transactions where address of interest is found
            for tx in block.transactions:
                if tx["from"] in addresses or tx["to"] in addresses:
                    tx_details = {
                        "block": block_num,
                        "hash": tx.hash.hex(),
                        "from": tx["from"],
                        "to": tx["to"],
                        "value": tx["value"],
                        "gas": tx["gas"],
                        "gasPrice": tx["gasPrice"],
                        "input": tx["input"],
                        "token_transfers": [],
                    }

                    # Fetch token transfers
                    tx_details["token_transfers"].extend(get_token_transfers(tx["from"], block_num))
                    if tx["to"]:  # "to" is None for contract-creation transactions
                        tx_details["token_transfers"].extend(get_token_transfers(tx["to"], block_num))

                    transactions.append(tx_details)

            # Update the progress bar
            pbar.update(1)

    return transactions

Fetch Internal Transactions​

As we mentioned in the previous section, the main function also missed capturing internal transactions. An internal transaction occurs when a smart contract, while executing a transaction sent by an EOA (wallet), calls another contract or transfers value. The details of those interactions are only available by inspecting the traces of the transaction. So we will create a get_internal_transactions function that runs debug_traceTransaction and fetches those details when applicable. The Trace/Debug API is exclusive to the Build plan and above. For more information on the plans and their features, visit the QuickNode pricing page.

# Function to fetch wallet internal transactions
def get_internal_transactions(tx_hash):
    try:
        # Making request
        trace = w3.provider.make_request("debug_traceTransaction", [tx_hash, {"tracer": "callTracer"}])

        internal_txs = []
        if "result" in trace:
            # "calls" may be absent when the transaction made no internal calls
            internal_txs.extend(trace["result"].get("calls", []))
        return internal_txs

    except Exception as e:
        # Always return a list so the caller can safely extend() with the result
        return [{"error": str(e)}]
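
The callTracer output is a tree: each call frame can carry its own nested calls list. As a rough sketch, the hypothetical helper flatten_calls below walks such a tree depth-first and produces one flat record per internal call. The sample_trace dictionary is made-up data shaped like a callTracer result, not output from a real transaction.

```python
def flatten_calls(call, depth=0):
    """Yield every nested call in a callTracer-style frame, depth-first, tagged with its depth."""
    for child in call.get("calls", []):
        yield {
            "depth": depth + 1,
            "type": child.get("type"),
            "from": child.get("from"),
            "to": child.get("to"),
            # value is a hex string when present; some call types may omit it
            "value": int(child.get("value", "0x0"), 16),
        }
        yield from flatten_calls(child, depth + 1)


# Made-up trace for illustration (addresses shortened for readability)
sample_trace = {
    "type": "CALL", "from": "0xaaa", "to": "0xbbb", "value": "0x0",
    "calls": [
        {
            "type": "CALL", "from": "0xbbb", "to": "0xccc", "value": "0xde0b6b3a7640000",
            "calls": [{"type": "STATICCALL", "from": "0xccc", "to": "0xddd"}],
        },
        {"type": "DELEGATECALL", "from": "0xbbb", "to": "0xeee"},
    ],
}
internal = list(flatten_calls(sample_trace))
```

A flat list like this is easier to filter for value transfers that touch the audited wallet than the nested structure.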

Now, all we need to do is update our get_transactions_for_addresses function to call the get_internal_transactions function:

# Main function to fetch wallet activity across a range of blocks
def get_transactions_for_addresses(addresses, from_block, to_block):
    transactions = []

    # Calculate the total number of blocks to process
    total_blocks = to_block - from_block + 1

    with tqdm(total=total_blocks, desc="Processing Blocks") as pbar:
        for block_num in range(from_block, to_block + 1):
            # Request block data, including full transaction objects
            block = w3.eth.get_block(block_num, full_transactions=True)

            # Identify block transactions where address of interest is found
            for tx in block.transactions:
                if tx["from"] in addresses or tx["to"] in addresses:
                    tx_details = {
                        "block": block_num,
                        "hash": tx.hash.hex(),
                        "from": tx["from"],
                        "to": tx["to"],
                        "value": tx["value"],
                        "gas": tx["gas"],
                        "gasPrice": tx["gasPrice"],
                        "input": tx["input"],
                        "token_transfers": [],
                        "internal_transactions": []
                    }

                    # Fetch token transfers
                    tx_details["token_transfers"].extend(get_token_transfers(tx["from"], block_num))
                    if tx["to"]:  # "to" is None for contract-creation transactions
                        tx_details["token_transfers"].extend(get_token_transfers(tx["to"], block_num))

                    # Check for interactions with contracts and get internal transactions
                    if tx["to"] and w3.eth.get_code(tx["to"]).hex() != "0x":
                        tx_details["internal_transactions"].extend(get_internal_transactions(tx.hash.hex()))

                    transactions.append(tx_details)

            # Update the progress bar
            pbar.update(1)

    return transactions

Run Your Audit​

Now that we have those three functions built, let's add an execution function to the script that takes all the parameters necessary to run the audit.

# Execution function
def run(addresses, from_block, to_block):
    transactions = get_transactions_for_addresses(addresses, from_block, to_block)

    # Write output to the JSON file
    output_file_path = "wallet_audit_data.json"
    with open(output_file_path, "w") as json_file:
        json.dump(transactions, json_file, indent=4)

All that is left to do is add the usage snippet below at the end of your script with the wallet address you want to audit and the block range of interest. Run the script from your terminal (e.g. python3 wallet_auditor.py) or inside VS Code (e.g. click the play button on the top-right of the editor). For the purpose of this demo, I am doxing my friend and colleague @bunsen:

# Usage example:
# run(["{wallet_address}"], from_block, to_block)
run(["0x91b51c173a4bDAa1A60e234fC3f705A16D228740"], 17881437, 17881437)

After execution, you will find a wallet_audit_data.json file in the same directory as your script, containing the entire activity trail of the wallet: transactions, token transfers, and internal transactions. Great work!
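
Once the JSON file exists, a first sanity check is to total the ETH moved by the wallet. The sketch below assumes the record layout produced by the script (from, to, and value in wei) and uses a hypothetical summarize helper. It covers only top-level transaction values, so gas fees, token transfers, and internal transactions are not included in the totals.

```python
from decimal import Decimal

WEI_PER_ETH = Decimal(10) ** 18  # 1 ETH = 10^18 wei


def summarize(transactions, address):
    """Total the ETH sent and received by `address` across top-level transactions."""
    address = address.lower()
    sent = sum(tx["value"] for tx in transactions if tx["from"].lower() == address)
    received = sum(tx["value"] for tx in transactions if (tx["to"] or "").lower() == address)
    return {
        "tx_count": len(transactions),
        "eth_sent": Decimal(sent) / WEI_PER_ETH,
        "eth_received": Decimal(received) / WEI_PER_ETH,
    }


# Made-up records for illustration; real input would come from wallet_audit_data.json:
# with open("wallet_audit_data.json") as f:
#     summary = summarize(json.load(f), "{wallet_address}")
wallet = "0x91b51c173a4bDAa1A60e234fC3f705A16D228740"
other = "0x0000000000000000000000000000000000000001"
sample = [
    {"from": wallet, "to": other, "value": 10**18},
    {"from": other, "to": wallet, "value": 5 * 10**17},
    {"from": wallet, "to": None, "value": 0},
]
summary = summarize(sample, wallet)
```

Decimal is used instead of float so that wei amounts convert to ETH without rounding error.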

If you want to reference our code in its entirety, check out our GitHub page here.

Conclusion​

Congratulations, you are now able to perform a thorough audit on any wallet address on Ethereum!

This guide covers the basics needed to conduct a full audit of blockchain activity. To take it to the next level, you could:


  • Fetch data for a collection of wallet addresses
  • Reduce expected runtime by utilizing multi-threading to run your script in parallel for chunks of block ranges
  • Identify token transfers beyond ERC20, such as ERC721 and ERC1155
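
As a rough sketch of the parallelization idea, the block range can be split into chunks that are handed to a thread pool. The helpers chunk_block_range and audit_in_parallel are hypothetical, as are the chunk_size and max_workers defaults; fetch_chunk stands in for any function that takes a start and end block and returns a list of records. Your endpoint's rate limits will determine how many workers are practical.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_block_range(from_block, to_block, chunk_size):
    """Split an inclusive block range into consecutive inclusive (start, end) chunks."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        (start, min(start + chunk_size - 1, to_block))
        for start in range(from_block, to_block + 1, chunk_size)
    ]


def audit_in_parallel(fetch_chunk, from_block, to_block, chunk_size=1000, max_workers=4):
    """Run fetch_chunk(start, end) over each chunk in a thread pool and merge results in block order."""
    chunks = chunk_block_range(from_block, to_block, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as the input chunks
        results = pool.map(lambda bounds: fetch_chunk(*bounds), chunks)
        return [record for chunk_result in results for record in chunk_result]


# Possible usage with the function from this guide (assumption, not run here):
# audit_in_parallel(
#     lambda start, end: get_transactions_for_addresses(addresses, start, end),
#     from_block, to_block,
# )
```

Threads suit this workload because the script spends most of its time waiting on network responses rather than computing.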

To learn more about how QuickNode is helping auditing firms to pull this type of data from blockchains in a way that guarantees completeness and accuracy of the data, please feel free to reach out to me at victor@quicknode.com and I would love to talk to you!
