Feature Overview
Integrated Tools for Comprehensive Data Analysis
AUDIPY is a versatile software tool for analysing, processing and visualising structured datasets. It helps you search, filter and evaluate large volumes of data efficiently, without complicated processes or additional tools.
Thanks to its flexible architecture, complex analyses can be carried out just as easily as quick day-to-day evaluations. Below you will find an overview of the core functions that make AUDIPY a powerful tool for data analysis and visualisation.
BASICS, FILE MANAGEMENT AND IMPORT
Supported File Formats
- Parquet – Highly optimised column format (standard format)
- Excel – XLSX, XLSB, XLS (all versions)
- CSV / TXT – Automatic character set and delimiter detection
- XML – Structured data imports
- JSON – Support for nested structures
- DBF – DBase files
- ORC – Apache ORC format
- Power BI – Direct extraction from PBIX files
Database Connectors
- Microsoft SQL Server – With Windows authentication
- MySQL & MariaDB – Full compatibility
- PostgreSQL – Native support
- SQLite – Embedded databases
- Microsoft Access – MDB/ACCDB via ODBC
- ODBC – Connection to any data source
- SSL/TLS encryption for secure connections
Import Wizards
- Data Wizard for CSV/TXT with preview and delimiter detection
- Skip headers or footers, character set selection (UTF-8, ISO, etc.)
- Excel Wizard with Smart Merge and automatic header detection
- Merging multiple worksheets
- Report Wizard for extraction from PDF files
- Database Wizard with SQL Query Builder
- Connection test and error diagnostics integrated
Special Imports
- GoBD-compliant XML import (Index.xml with references)
- SAP DART files – Automatic structure import
- XRechnung – Extraction and visualisation
- Power BI Integration – Access to DAX & M Queries
- Batch import – Read multiple files simultaneously
- Folder import – Recursive processing of entire directories
Explorer & Project Management
- Hierarchical file explorer with tree view
- Project and case management with quick switching
- Data lineage – Tracking of derivations
- Creation and management of custom folder structures
- Multiple files open simultaneously (tab system)
- Drag & Drop – Import files directly
DATA EXPLORATION AND INTERACTION
Column Formatting
- Automatic detection and conversion of date formats
- Support for Unix timestamps and Excel serial numbers
- Number formats – Integers, decimals, EU/US separators
- Thousands separators, percentage and currency display
- Formatting for text, Boolean and hyperlinks
- Configurable decimal places and display precision
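The timestamp conversions listed above can be illustrated with a minimal sketch. The function names are hypothetical and are not part of AUDIPY; the sketch assumes Unix timestamps in seconds (UTC) and Excel's 1900 date system.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Excel's 1900 date system. Using 1899-12-30 as the epoch
# compensates for Excel treating 1900 as a leap year, so dates from
# 1900-03-01 onwards convert correctly.
EXCEL_EPOCH = datetime(1899, 12, 30)


def from_unix(seconds: float) -> datetime:
    """Convert a Unix timestamp (seconds since 1970-01-01 UTC) to a datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def from_excel_serial(serial: float) -> datetime:
    """Convert an Excel serial number (days, fraction = time of day) to a datetime."""
    return EXCEL_EPOCH + timedelta(days=serial)
```

For example, `from_excel_serial(45292.5)` yields noon on 1 January 2024.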
Column and Row Actions
- Duplicate, rename, delete or hide columns
- Reorder via drag & drop with automatic saving
- Adjust column widths individually and save permanently
- Sort by one or more columns
- Copy cell contents – individually or in blocks
- Editable columns – Direct editing in the table
- Add or delete rows in edit mode
Advanced Filtering
- Simple filters – Equal, Not equal, Contains, Starts with, Ends with
- Range filters – Greater than, Less than, Between
- Date filters – Year, month, week, weekday or time periods
- Fuzzy filter – Similar values with configurable accuracy
- Multi-value filter – Select multiple values (In / Not In)
- AND/OR logic – Combine multiple filter conditions
- Filter bookmarks – Save, name and reuse
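As an illustration of the fuzzy filter idea, the following sketch keeps values whose similarity to a target reaches a configurable threshold. It uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure; which measure AUDIPY actually uses is not stated here, and `fuzzy_filter` is a hypothetical name.

```python
from difflib import SequenceMatcher


def fuzzy_filter(values, target, threshold=0.8):
    """Return the values whose similarity ratio to `target` is >= threshold.

    The ratio ranges from 0.0 (nothing in common) to 1.0 (identical).
    Comparison is case-insensitive.
    """
    target_norm = target.casefold()
    return [
        value
        for value in values
        if SequenceMatcher(None, value.casefold(), target_norm).ratio() >= threshold
    ]


vendors = ["Müller GmbH", "Mueller GmbH", "Miller Ltd", "Schmidt AG"]
similar = fuzzy_filter(vendors, "Müller GmbH", threshold=0.8)
```

Lowering the threshold widens the result set; raising it towards 1.0 approaches an exact-match filter.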
Column Statistics
- Sum, mean, median, minimum and maximum
- Standard deviation, variance and quantiles (25%, 50%, 75%)
- Unique values – Frequency and cardinality analysis
- Null values – Detection and percentage display
- Type-adaptive statistics – Automatic detection of text, number or date
- Inline display of statistical results in tables
External Actions & Integration
- Google Search – Start directly with cell content
- Google Translate – Translate selected content
- Google Maps – Interactive address search and map view
- XRechnung visualisation – Display structured invoice data
- Link to external Parquet files (Cross-File Lookup)
- Launch external applications from context menus
- Execute custom Python scripts or analyses
User Interface & Interaction
- Multilingual interface – German, English, Dutch
- Design themes – Light, Dark, Custom
- Zebra pattern – Alternating row colours for better readability
- Adjustable font size, DPI scaling for high-resolution displays
- Context menus with quick access to all table functions
- Responsive layouts for different screen sizes
DATA PROCESSING AND TRANSFORMATION
Stream Processing
- Process very large datasets in chunks
- Significantly reduced memory consumption thanks to streaming
- Stream extraction – Real-time filtering during import
- Stream grouping – Aggregation without memory overflow
- Stream join of large files
- Stream append – Merging many files
- Stream pivot – Create large cross-tables in streaming mode
- Based on DuckDB – Highest performance and efficiency
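The principle behind stream grouping can be sketched with the standard library alone: rows are read one at a time and only the running totals per group stay in memory. This is a conceptual illustration, not AUDIPY's DuckDB-based implementation; the function name and the sample columns are assumptions.

```python
import csv
import io
from collections import defaultdict


def stream_group_sum(lines, key, value):
    """Sum `value` per `key` while reading CSV rows one at a time.

    `lines` is any iterable of text lines (an open file, for instance),
    so memory use depends on the number of groups, not the number of rows.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        totals[row[key]] += float(row[value])
    return dict(totals)


# Small in-memory stand-in for a large file.
sample = io.StringIO("account,amount\nA,10.5\nB,4\nA,2.5\n")
totals = stream_group_sum(sample, "account", "amount")
```

Passing an open file handle instead of the `StringIO` object applies the same logic to a file of arbitrary size.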
Data Joins
- Inner Join – Only matching records
- Left Join – All records from the left table
- Right Join – All records from the right table
- Full Outer Join – All records from both tables
- Anti Join – Only non-matching records
- Fuzzy Join – Similarity-based matching
- Join across multiple key columns
- Append / Concatenate – Add tables row by row
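The difference between a left join and an anti join, including joins across multiple key columns, can be shown with a small sketch on lists of dictionaries. All names are hypothetical; this is not AUDIPY's join engine.

```python
def index_by_key(rows, keys):
    """Group rows by the tuple of their values in the key columns."""
    index = {}
    for row in rows:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    return index


def left_join(left, right, keys):
    """Keep every left row; add the right-hand columns where the keys match."""
    index = index_by_key(right, keys)
    result = []
    for row in left:
        matches = index.get(tuple(row[k] for k in keys))
        if matches:
            result.extend({**match, **row} for match in matches)
        else:
            result.append(dict(row))  # unmatched: right-hand columns are absent
    return result


def anti_join(left, right, keys):
    """Keep only the left rows that have no match on the right."""
    index = index_by_key(right, keys)
    return [row for row in left if tuple(row[k] for k in keys) not in index]


invoices = [{"id": 1, "vendor": "V1"}, {"id": 2, "vendor": "V9"}]
vendors = [{"vendor": "V1", "name": "Acme"}]
unknown_vendor_invoices = anti_join(invoices, vendors, ["vendor"])
```

In an audit context the anti join is the quick way to find, for example, invoices that reference a vendor missing from the master data.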
Data Cleansing
- Find and automatically remove duplicates
- Delete empty rows – completely or partially
- Fill missing values (Forward/Backward Fill, Mean, Median)
- Outlier detection using statistical methods
- Data type validation – automatic correction
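Forward and backward fill can be sketched in a few lines. The sketch assumes missing values are represented as `None`; the function names are illustrative only.

```python
def forward_fill(values):
    """Replace each None with the most recent non-missing value before it."""
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def backward_fill(values):
    """Replace each None with the next non-missing value after it."""
    return forward_fill(values[::-1])[::-1]
```

Leading gaps stay empty under forward fill and trailing gaps stay empty under backward fill, because there is no neighbouring value to copy from.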
Column Transformation
- Merge – Combine columns with delimiters
- Split – By delimiter, length or pattern
- Extract – Numbers only, text only or by RegEx
- Replace – Substring or exact match
- Text operations – Trim, convert to upper/lower case
- Date operations – Extract year, month, day or time components
Aggregation & Pivot
- Group by one or more columns
- Standard aggregations – Sum, average, count, min, max
- Advanced aggregations – Median, variance, standard deviation
- Pivot tables – Flexible cross-tables with multiple values
- Drill-down – Interactive detail view in pivot results
- Custom calculations and groupings
Calculated Columns
- Formula editor with real-time preview
- Mathematical operators (+, -, *, /, %, power)
- Statistical functions – Mean, median, standard deviation
- Date/time differences – Calculate time intervals
- Lag/Lead – Reference to previous or next row
- If-Then functions – Arbitrarily nested conditions
- Cumulative sums with grouping
- Full support for Python expressions
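To make Lag/Lead and grouped cumulative sums concrete, here is a minimal sketch in plain Python. It does not reproduce the formula editor's syntax, which is not shown here; the helper names and the sample columns are assumptions.

```python
from collections import defaultdict


def lag(values, n=1, fill=None):
    """Shift values down by n rows: each row sees the value n rows earlier."""
    return ([fill] * n + list(values))[: len(values)]


def lead(values, n=1, fill=None):
    """Shift values up by n rows: each row sees the value n rows later."""
    return (list(values) + [fill] * n)[n:]


def grouped_cumsum(rows, group_key, value_key):
    """Running total of `value_key`, restarted separately for each group."""
    running = defaultdict(float)
    result = []
    for row in rows:
        running[row[group_key]] += row[value_key]
        result.append(running[row[group_key]])
    return result


postings = [
    {"account": "A", "amount": 10},
    {"account": "B", "amount": 5},
    {"account": "A", "amount": 7},
]
balances = grouped_cumsum(postings, "account", "amount")
```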
New Columns
- Editable text columns for manual input
- Editable numeric columns for calculations
- Index column – Sequential numbering
- Audit column – Manual markings (✔, ✖ or empty)
- Constant column – Fixed values for all rows
Comparison & Difference Analysis
- Table comparison – Compare two datasets directly
- Row-by-row analysis – Added, removed, changed entries
- Column-by-column differences – Make field changes visible
- Diff report – Clear summary of changes
- Export of differences to Excel
AI FUNCTIONS (ASK MY DATA – AMY)
AI Chat Interface
- Ask my Data (AMY) – Ask questions in natural language
- Answers as text, table or interactive charts
- Automatic Python code generation for analyses
- Error correction and improvement suggestions for code
- Context preservation – Follow-up questions and multi-step dialogues
- Gatekeeper function: If a question is imprecise, AMY asks for clarification
- Favourite prompts: Save prompts and execute via quick command
Supported AI Models
- AUDIPY AI – GDPR-compliant, with full data privacy
- OpenAI – Various models (optionally integrable)
- Deepseek – Various models
- GPT4All – Local models without internet connection
- Custom APIs – Connection to custom endpoints
- With AUDIPY AI: Only metadata (e.g. column names) is processed
- No sensitive data content leaves the system
Magic Prompts & Playbooks
- Magic Prompt Generator – Context-dependent analysis suggestions
- Consideration of industry, data content and system context
- Automatic data quality check – Missing values, duplicates
- Outlier and anomaly detection
- Distribution analysis for identifying data patterns
- Playbook creation for secure analysis with AI agents
AI-Assisted Column Creation
- AI Column Generator – Create new columns automatically
- Complex transformations without programming knowledge
- Extraction of text patterns via AI analysis
- Automated calculation of context-related values
AI Visualisation & Insights
- Automatic chart generation – Selects the best-fitting chart type
- Insight generation – Important findings are highlighted
- Trend detection – Identification of developments in the dataset
- Anomaly explanation – AI provides explanations for unusual values
- Visual storyboards – Automatically generated analyses as a report
Documentation & Sharing
- AMY bookmarks – Save and re-execute AI queries
- Automatic documentation of all AI analyses performed
- View, edit and export Python code
- Save interactive charts as HTML or PNG
- Reproducible workflows – Analyses exactly repeatable
- Column description generation by AMY
DATA ANALYSIS AND VISUALISATION
Descriptive Statistics
- Unique values (all columns) – Identify primary keys
- Unique values (column) – Frequency analysis
- Missing values – Check data completeness
- Value distribution – Percentage and absolute frequencies
- Quartiles – 25th, 50th, 75th percentiles
- Summary – All key metrics at a glance
Outlier Analysis (Anomalies)
- Isolation Forest – Machine-learning-based outlier detection
- Z-Score and Modified Z-Score – Classic statistical methods
- Rolling Z-Score – Optimised for time series
- Automatic mode – Selects the best method based on data type
- Category-based analysis – Group-specific evaluation
- Interactive visualisation – Display with rolling median
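The two classic methods can be compared in a short sketch. The Z-score uses mean and standard deviation; the Modified Z-score uses the median and the median absolute deviation (MAD) with the usual scaling constant 0.6745 and a commonly cited cut-off of 3.5. Function names and the sample data are illustrative, not AUDIPY's implementation.

```python
from statistics import mean, median, pstdev


def z_scores(values):
    """Classic Z-score: distance from the mean in standard deviations."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("all values are identical")
    return [(v - mu) / sigma for v in values]


def modified_z_scores(values):
    """Modified Z-score based on the median and the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; modified Z-score is undefined")
    return [0.6745 * (v - med) / mad for v in values]


amounts = [10, 11, 9, 10, 12, 10, 95]
outliers = [
    v for v, score in zip(amounts, modified_z_scores(amounts)) if abs(score) > 3.5
]
```

In this small sample the classic Z-score of 95 stays below 3 because the extreme value inflates the standard deviation, while the Modified Z-score flags it clearly. This is one reason robust variants are offered alongside the classic method.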
Distribution Analysis
- Histogram with various bin methods (Sturges, Rice, Scott, Freedman-Diaconis)
- Kernel Density Estimation (KDE) – Smooth density distribution
- Box plot and violin plot – Compare quartiles and distribution
- Comparison with theoretical normal distribution
- Category-based overlays – Multiple groups in one view
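The four bin methods named above follow well-known textbook formulas, sketched here with the standard library. Sturges and Rice give a bin count directly; Scott and Freedman-Diaconis give a bin width that is then converted into a count. The helper names are hypothetical, and the quartiles use the default method of `statistics.quantiles`, which may differ slightly from other tools.

```python
import math
from statistics import quantiles, stdev


def sturges_bins(n):
    """Sturges: k = ceil(log2(n)) + 1."""
    return math.ceil(math.log2(n)) + 1


def rice_bins(n):
    """Rice: k = ceil(2 * n^(1/3))."""
    return math.ceil(2 * n ** (1 / 3))


def scott_width(values):
    """Scott: h = 3.49 * s * n^(-1/3), with s the sample standard deviation."""
    return 3.49 * stdev(values) * len(values) ** (-1 / 3)


def freedman_diaconis_width(values):
    """Freedman-Diaconis: h = 2 * IQR * n^(-1/3)."""
    q1, _, q3 = quantiles(values, n=4)
    return 2 * (q3 - q1) * len(values) ** (-1 / 3)


def bins_from_width(values, width):
    """Convert a bin width into a bin count over the data range."""
    return math.ceil((max(values) - min(values)) / width)
```

Freedman-Diaconis relies on the interquartile range and is therefore less sensitive to outliers than Scott's rule, which relies on the standard deviation.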
Time Series Analysis
- Seasonal Decomposition – Breakdown into trend, seasonality and residuals
- Trend and pattern recognition over time
- Rolling averages – Moving averages
- Line charts and area charts for trend visualisation
- Seasonal comparison views across multiple periods
Benford Analysis (Digit Analysis)
- Forensic method for detecting potential manipulation
- First, second and combined digit testing
- Last digit check – Validate trailing digits
- Category-based Benford analysis – Grouped checks
- Visual representation of deviations from Benford distribution
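For the first-digit test, Benford's law expects the leading digit d to occur with probability log10(1 + 1/d). The following sketch computes the deviation of the observed shares from that expectation. It is a minimal illustration with hypothetical names and covers only the first-digit test, not the second-digit, combined or last-digit checks.

```python
import math
from collections import Counter
from decimal import Decimal

# Expected first-digit probabilities under Benford's law.
BENFORD_FIRST = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Leading significant digit of a number (0 only if x itself is 0)."""
    return Decimal(str(abs(x))).as_tuple().digits[0]


def first_digit_deviation(values):
    """Observed share minus expected Benford share for each digit 1 to 9."""
    counts = Counter(first_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: counts[d] / total - p for d, p in BENFORD_FIRST.items()}


# Powers of two are a standard example of Benford-distributed numbers.
deviation = first_digit_deviation([2 ** k for k in range(1, 101)])
```

Large positive or negative deviations for individual digits are the kind of signal the visual comparison is meant to surface; whether they indicate manipulation always requires follow-up.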
Gap Analysis
- Check completeness of sequential number ranges
- Check invoice numbers for gaps and duplicates
- Grouped number ranges by prefix or category
- Reset fields for number ranges that restart periodically (e.g. annually)
- Timeline visualisation to display gaps
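The core of a gap and duplicate check on a sequential number range can be sketched as follows. The sketch assumes purely numeric document numbers; prefixes, groups and periodic resets would need to be split off beforehand. The function name is illustrative.

```python
from collections import Counter


def find_gaps_and_duplicates(numbers):
    """Return (gaps, duplicates) for a list of integer document numbers.

    Each gap is a (first_missing, last_missing) tuple; duplicates are the
    numbers that occur more than once.
    """
    counts = Counter(numbers)
    present = sorted(counts)
    gaps = [
        (low + 1, high - 1)
        for low, high in zip(present, present[1:])
        if high - low > 1
    ]
    duplicates = sorted(n for n, c in counts.items() if c > 1)
    return gaps, duplicates


gaps, duplicates = find_gaps_and_duplicates([1001, 1002, 1002, 1005, 1006, 1009])
```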
Advanced Visualisations
- Network diagrams – Relationships between entities
- Sankey / Money Flow – Representation of cash flows
- Parallel coordinates – Multidimensional data analysis
- Scatter plots – Correlations between variables
- Bar and line charts – Visualise shares and categories
Interactive Charts
- Plotly-based interactive visualisations
- Zoom, pan and selection functions directly in the chart
- Hover tooltips with detailed information
- Export as interactive HTML or static image (PNG)
AUDIT, COMPLIANCE AND ADVANCED FEATURES
Audit Steps & Reproducibility
- Audit Steps – Reusable, standardised audit procedures
- My Audit Manager – Management of audit workflows
- Define and save custom audit steps
- Python scripts for automated audits
- Automatic documentation of all actions
- Easy recording of audit steps at the push of a button
- Audit Step Editor: Build your audit routines with custom user dialogs
Monetary Unit Sampling (MUS)
- Sample planning using Poisson, Alpha-Beta and Hypergeometric methods
- Cumulative value selection – Systematic extraction
- Sample evaluation using various methods
- Tainting & extrapolation – Projection to the population
- Reproducible thanks to seed values
- PDF report for documentation
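Cumulative value selection with a seed can be sketched like this: the population total is divided into equal monetary intervals, a random start is drawn once, and every item containing a selection point is picked. This is a simplified illustration under stated assumptions (positive amounts, fixed interval, hypothetical names); it does not cover sample size planning or evaluation.

```python
import random


def mus_select(items, sample_size, seed):
    """Systematic monetary unit selection over (item_id, amount) pairs.

    Items larger than the sampling interval are always selected; the seed
    makes the random start, and thus the sample, reproducible.
    """
    total = sum(amount for _, amount in items)
    interval = total / sample_size
    next_hit = random.Random(seed).uniform(0, interval)

    selected, cumulative = [], 0.0
    for item_id, amount in items:
        cumulative += amount
        hit = False
        while next_hit < cumulative:  # the item may contain several hits
            hit = True
            next_hit += interval
        if hit:
            selected.append(item_id)
    return selected


ledger = [("A", 100), ("B", 5000), ("C", 200), ("D", 300), ("E", 4400)]
sample = mus_select(ledger, sample_size=4, seed=42)
```

Because selection is proportional to value, high-value items such as B and E are certain to be drawn here, which is the risk-based property MUS is chosen for.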
Bookmark System
- Analysis bookmarks – Save complete views
- Reuse filter, pivot and chart bookmarks
- Notes and descriptions for each entry
- Colour markings for categorisation
- Quick access via the dashboard
Data Integrity & Security
- Hash chain technology – Guarantee immutability
- SHA-256, MD5 and xxHash – Checksums for integrity checks (xxHash is a fast, non-cryptographic hash)
- Signature verification – Ensure data provenance
- History report – Complete change history
- Metadata management – Store extended data information
- Encrypted API keys for secure integration
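The idea behind a hash chain can be shown in a minimal sketch: each entry's hash covers its own content plus the hash of the previous entry, so changing any earlier entry invalidates everything after it. The record layout and function names are assumptions for illustration and do not describe AUDIPY's internal format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    data = json.dumps(payload, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256((prev_hash + data).encode("utf-8")).hexdigest()


def build_chain(payloads):
    """Link the payloads into a chain of {"payload", "hash"} entries."""
    chain, prev = [], GENESIS
    for payload in payloads:
        prev = entry_hash(prev, payload)
        chain.append({"payload": payload, "hash": prev})
    return chain


def verify_chain(chain):
    """Recompute every hash; return False at the first mismatch."""
    prev = GENESIS
    for entry in chain:
        if entry_hash(prev, entry["payload"]) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


history = build_chain([{"step": "import"}, {"step": "filter"}, {"step": "export"}])
```

Editing the payload of the first entry afterwards makes `verify_chain(history)` return `False`, which is how tampering becomes detectable.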
Export & Reporting
- Parquet, CSV and Excel export with formatting options
- PDF reports with integrated visualisations
- HTML export – Interactive reports as web pages
- Power BI export – Create PBIX files directly
- Column selection for targeted export
- "What you see is what you get" – Filters are carried over
Sampling Methods
- Random and systematic samples
- Stratified samples by category
- Value-proportional selection (MUS) – Risk-based
- Reproducible results through seed values
- Automatic calculation of sample size
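A reproducible stratified sample can be sketched with a seeded random generator: rows are grouped by category and the same number of rows is drawn from each group. The function name, the fixed per-stratum size and the sample data are assumptions made for this illustration.

```python
import random
from collections import defaultdict


def stratified_sample(rows, key, per_stratum, seed):
    """Draw up to `per_stratum` rows from each category of `key`."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[row[key]].append(row)

    sample = []
    for name in sorted(strata):  # fixed order keeps the result reproducible
        group = strata[name]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample


rows = [{"id": i, "region": "north" if i % 2 else "south"} for i in range(10)]
picked = stratified_sample(rows, "region", per_stratum=2, seed=7)
```

Running the function again with the same seed returns the same rows, which makes the selection verifiable by a reviewer.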
Developer Mode & Advanced Features
- Metadata management – Read, delete, replace
- Python integration for individual automations
- Debug logging for detailed error analysis
- Feature toggles – Activate experimental functions
Compliance Support
- GoBD-compliant – Meets the German principles for electronic bookkeeping and record retention
- GDPR-compliant – Data protection fully guaranteed
- Audit-proof – Immutable documentation
- Complete audit trail for full traceability
- Privacy by Design – Data protection integrated from the start
PERFORMANCE & SCALABILITY
Big Data Handling
- Streaming operations – Process large datasets without fully loading them
- DuckDB integration – In-memory analytics for millions of rows
- Chunked processing – Processing in flexible blocks
- Intelligent memory management – Automatic memory cleanup
- Lazy loading – Load only currently needed data
- Column selection – Read only relevant columns
Asynchronous Processing
- Background processing – UI remains responsive during processing
- Multi-threading – Parallel execution of tasks
- Resource management – Automatic release of unused memory areas
Visualisation of Large Datasets
- Figure resampling – Automatic reduction of data points for high performance
- Zoom-based aggregation – Detailed view when zooming in
- Interactive downsampling – Smooth navigation without loss of detail
- Efficient rendering – Optimised display even with millions of points
- GPU-accelerated visualisation – Maximum performance at high resolution

