Database Cleanup: Audit and Remove Unused 'sid' Field from Schedule Collection
Description
🔍 Database Cleanup Task: Audit and Remove Unused 'sid' Field
Objective: Investigate and clean up the 'sid' field in the schedule collection that appears to be inconsistently used across the application.
📋 Task Description
Perform a comprehensive audit of the 'sid' field usage in the schedule collection and remove it if it's not properly utilized throughout the application.
🎯 Task Goals
-
Database Analysis: Determine which schedules have 'sid' populated vs empty/null
-
Code Audit: Find all places in the application that reference or use the 'sid' field
-
Impact Assessment: Understand what functionality depends on 'sid'
-
Safe Removal: Remove the field if it's unused or inconsistently populated
-
Data Cleanup: Clean up any orphaned references or unused data
🔍 Investigation Steps
Phase 1: Database Analysis
-
[ ] Query schedule collection to count records with 'sid' populated
-
[ ] Identify patterns in 'sid' usage (which schedule types use it)
-
[ ] Check data consistency - are 'sid' values unique, properly formatted?
-
[ ] Analyze relationships - does 'sid' relate to other collections?
-
[ ] Document findings with statistics and examples
Phase 2: Codebase Audit
-
[ ] Search Flutter codebase for all references to 'sid' field
-
[ ] Check database queries that select, insert, or update 'sid'
-
[ ] Review API endpoints that expose or consume 'sid'
-
[ ] Identify business logic that depends on 'sid' values
-
[ ] Document all usage locations with file paths and line numbers
Phase 3: Impact Assessment
-
[ ] Test scenarios without 'sid' field to identify breaking changes
-
[ ] Review user workflows that might depend on 'sid'
-
[ ] Check external integrations that might expect 'sid'
-
[ ] Assess migration complexity if removal is needed
-
[ ] Document potential risks and mitigation strategies
Phase 4: Cleanup Implementation
-
[ ] Create migration script to remove 'sid' field from collection
-
[ ] Update code references to remove 'sid' usage
-
[ ] Test thoroughly to ensure no functionality breaks
-
[ ] Backup data before making changes
-
[ ] Execute cleanup in staging environment first
🗃️ Database Investigation Queries
Analysis Queries to Run:
// Count schedules with sid populated
db.schedules.countDocuments({sid: {$exists: true, $ne: null, $ne: ""}})
// Count schedules without sid
db.schedules.countDocuments({$or: [{sid: {$exists: false}}, {sid: null}, {sid: ""}]})
// Sample records with sid
db.schedules.find({sid: {$exists: true, $ne: null}}).limit(10)
// Check sid uniqueness
db.schedules.aggregate([
{$group: {_id: "$sid", count: {$sum: 1}}},
{$match: {count: {$gt: 1}}}
])
// Find relationships with other collections
db.otherCollection.find({scheduleId: {$in: [/* sid values */]}})
💻 Code Audit Areas
Files/Areas to Check:
-
[ ] Models/Schemas: Schedule model definitions
-
[ ] Database Queries: All CRUD operations on schedules
-
[ ] API Endpoints: REST/GraphQL endpoints using schedules
-
[ ] Frontend Components: UI that displays or uses schedule data
-
[ ] Business Logic: Services that process schedule information
-
[ ] Migration Scripts: Historical database migrations
Search Patterns:
-
sid
(field name) -
schedule.sid
(object property access) -
scheduleId
(potential alternative naming) -
Database query builders with 'sid' references
⚠️ Risk Assessment
Potential Risks:
-
Data Loss: Removing field that's actually needed somewhere
-
Breaking Changes: Code that silently depends on 'sid'
-
Integration Issues: External systems expecting 'sid'
-
Performance Impact: Queries optimized around 'sid' indexing
Mitigation Strategies:
-
Comprehensive testing before and after changes
-
Staged rollout (dev → staging → production)
-
Backup and rollback plan ready
-
Monitor error logs for 'sid' related issues post-cleanup
📊 Success Criteria
Completion Indicators:
-
✅ Complete audit of 'sid' usage documented
-
✅ Database analysis shows consistent data state
-
✅ Code references either removed or justified
-
✅ Testing completed with no functionality regressions
-
✅ Documentation updated to reflect changes
-
✅ Performance maintained or improved after cleanup
🎯 Expected Outcomes
If 'sid' is Unused:
-
Database size reduction from removing unused field
-
Code simplification by removing unnecessary references
-
Improved maintainability with cleaner data model
-
Better performance from reduced field overhead
If 'sid' is Used:
-
Documented usage patterns for future reference
-
Consistent implementation across all usage points
-
Proper validation and data integrity rules
-
Clear business logic around 'sid' functionality
🔄 Related Work
Connection to Recent Issues:
-
SMR-81 (resolved): Home page loop was caused by orphaned schedule references
-
SMR-125: Database performance issues with repeated schedule queries
-
General cleanup: Part of improving database performance and consistency
Technical Debt Reduction:
This task addresses potential technical debt in the data model and helps ensure database consistency, which can prevent future bugs like the ones recently resolved.
📅 Estimated Effort
Time Estimate: 2-3 days
-
Day 1: Database analysis and code audit
-
Day 2: Impact assessment and testing
-
Day 3: Cleanup implementation and verification
Complexity: Medium - requires careful analysis but straightforward implementation"