
Troubleshooting


Last updated 1 year ago

At blocktorch we aim to be a reliable partner for our users throughout the whole software development lifecycle and to equip engineers with the right tooling and data insights when troubleshooting. Below is an outline of steps to take during the troubleshooting process and how blocktorch's toolkit can help along the way. Our mission is to help make web3 more reliable and secure for everyone involved.

1. Initial Assessment

  • Identify Symptoms: Determine if the issue is a downtime or a security breach.

    • Building the relevant monitors in blocktorch can be crucial for identifying symptoms as fast as possible. Blocktorch ships some KPIs and visuals out of the box, which can help in identifying issues, but you know your software better than we do, so building custom monitors is an important practice. From the charts, you can navigate directly to the related logs by clicking the data points you need to investigate further

  • Scope of Impact: Assess the extent—how many services, users, or systems are affected.

    • Looking at the stack traces and invocation flows of the relevant logs can help identify bottlenecks and affected services

    • We also highly recommend using blocktorch's frontend Dragon SDK to get deeper insights into client-side issues
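
A minimal sketch of the scope-of-impact assessment described above, assuming log entries have already been exported from a search. The `LogEntry` shape, the `summarizeImpact` helper, and the sample data are hypothetical and only illustrate the idea; blocktorch's actual log schema may differ.

```typescript
// Hypothetical log record shape; blocktorch's real log schema may differ.
interface LogEntry {
  service: string;
  user: string;
  level: "info" | "error";
}

interface ImpactSummary {
  affectedServices: string[];
  affectedUsers: number;
  errorCount: number;
}

// Count error entries and collect the distinct services and users they touch.
function summarizeImpact(logs: LogEntry[]): ImpactSummary {
  const services = new Set<string>();
  const users = new Set<string>();
  let errorCount = 0;
  for (const entry of logs) {
    if (entry.level !== "error") continue;
    services.add(entry.service);
    users.add(entry.user);
    errorCount += 1;
  }
  return {
    affectedServices: Array.from(services).sort(),
    affectedUsers: users.size,
    errorCount,
  };
}

// Made-up sample data for illustration.
const sampleLogs: LogEntry[] = [
  { service: "vault", user: "0xaaa", level: "error" },
  { service: "vault", user: "0xbbb", level: "error" },
  { service: "frontend", user: "0xaaa", level: "error" },
  { service: "oracle", user: "0xccc", level: "info" },
];

console.log(summarizeImpact(sampleLogs));
```

Here two services (`frontend` and `vault`) and two distinct users show errors, which gives a first estimate of how far the issue reaches.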

2. Communication

  • Notify Stakeholders: Inform the relevant team members and stakeholders about the issue.

    • When your team members and stakeholders are part of your blocktorch project, they can receive monitor alerts proactively

  • External Communication: If necessary, prepare a communication plan for customers or the public.

    • Home dashboards as well as search queries are also shareable with external stakeholders, so your community of users can stay informed as well

3. Isolation

  • Isolate Affected Systems: To prevent further damage, isolate the compromised or malfunctioning components.

  • Limit Access: Restrict access to sensitive systems until the nature and scope of the issue are understood.

    • You can disable functionalities in your UI

    • If your smart contracts are built with the ability to pause functions, consider doing so
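
A minimal sketch of the UI side of limiting access, assuming a simple in-memory kill switch. The `KillSwitch` class, the `guarded` helper, and the feature names are hypothetical and not part of blocktorch. The on-chain pause depends on how your contracts were written and is not shown here.

```typescript
// Hypothetical feature names for illustration only.
type Feature = "swap" | "deposit" | "withdraw";

// Tracks which UI features are currently switched off.
class KillSwitch {
  private disabled = new Set<Feature>();

  disable(feature: Feature): void {
    this.disabled.add(feature);
  }

  enable(feature: Feature): void {
    this.disabled.delete(feature);
  }

  isEnabled(feature: Feature): boolean {
    return !this.disabled.has(feature);
  }
}

// Run an action only if its feature is enabled; otherwise refuse with an error.
function guarded<T>(ks: KillSwitch, feature: Feature, action: () => T): T {
  if (!ks.isEnabled(feature)) {
    throw new Error(`Feature "${feature}" is temporarily disabled`);
  }
  return action();
}

const killSwitch = new KillSwitch();
killSwitch.disable("withdraw");

console.log(killSwitch.isEnabled("withdraw")); // false
console.log(guarded(killSwitch, "swap", () => "swap submitted")); // "swap submitted"
```

In a real frontend the disabled set would usually come from a remote configuration, so that a feature can be switched off without redeploying. The sketch keeps it in memory for brevity.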

4. Investigation

  • Review Logs: Check application, security, and system logs for anomalies or indicators of the cause by utilizing blocktorch's search

  • Identify Vulnerabilities: Look for any vulnerabilities or errors that might have led to the issue.

    • Your smart contract code can be accessed directly on blocktorch's contract details page

    • Blocktorch's step debugger can help you find the exact line of code causing the vulnerability or error

    • If your application uses Oracles and you believe the root cause could lie there, you can check the Oracle's out-of-the-box details

    • To test the services locally, you can leverage blocktorch's managed Hardhat forks

5. Mitigation

  • Patch and Update: Apply necessary patches or updates to software to mitigate the vulnerability or error.

    • We are aware that this can be especially hard when the root cause lies within the smart contract, unless your project uses an upgradable contract architecture

  • Alert your users: If a security breach is confirmed, warn users not to sign any malicious smart contract interactions

6. Recovery

  • Restore Services: Gradually restore services, ensuring they are fully sanitized and secure.

7. Postmortem Analysis

  • Analyze Causes: Thoroughly document what happened, why it happened, and how it was resolved.

  • Review Processes: Evaluate and update security policies, response strategies, and monitoring techniques to prevent future incidents.

8. Ongoing Monitoring

  • Regular Audits: Schedule regular security audits to ensure ongoing compliance and security.

  • Continuous Monitoring: Implement additional custom real-time monitoring in blocktorch to detect future issues promptly
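
The idea behind a custom real-time monitor can be sketched as a sliding-window error-rate check. This is an illustration under stated assumptions, not blocktorch's monitor API: the `ErrorRateMonitor` class, its window size, and its threshold are all hypothetical.

```typescript
// Hypothetical sliding-window monitor; not blocktorch's monitor API.
class ErrorRateMonitor {
  private outcomes: boolean[] = []; // true means the call failed
  private windowSize: number;
  private threshold: number;

  constructor(windowSize: number, threshold: number) {
    this.windowSize = windowSize;
    this.threshold = threshold;
  }

  // Record one call outcome and report whether the alert condition holds.
  record(failed: boolean): boolean {
    this.outcomes.push(failed);
    if (this.outcomes.length > this.windowSize) {
      this.outcomes.shift(); // drop the oldest outcome
    }
    return this.errorRate() > this.threshold;
  }

  // Share of failed calls within the current window.
  errorRate(): number {
    if (this.outcomes.length === 0) return 0;
    const failures = this.outcomes.filter((failed) => failed).length;
    return failures / this.outcomes.length;
  }
}

// Alert when more than half of the last four calls failed.
const monitor = new ErrorRateMonitor(4, 0.5);
console.log(monitor.record(false)); // false
console.log(monitor.record(true)); // false (rate is exactly 0.5)
console.log(monitor.record(true)); // true (2 of 3 calls failed)
```

A window this small is noisy. In practice the window and threshold would be tuned to the traffic of the contract or service being watched.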