Top LinkedIn Content on Organizing Digital Files Efficiently

15,396 followers 1y

Did you know that, by an often-quoted estimate, analysts spend a whopping 80% of their time just finding and preparing data? 🤯 That's like spending most of your workday searching for your misplaced TV remote! Well, good news, data wranglers!

𝐓𝐡𝐞 𝐩𝐫𝐨𝐛𝐥𝐞𝐦: Managing data transformations in BigQuery can sometimes feel like navigating a maze blindfolded. Keeping track of different versions of your SQL queries, collaborating with your team without stepping on each other's digital toes, and ensuring everything is reproducible? None of it is easy. You might have different scripts scattered everywhere, making it tough to understand who changed what, when, and why your perfectly crafted dashboard suddenly looks like abstract art.

𝐓𝐡𝐞 𝐒𝐨𝐥𝐮𝐭𝐢𝐨𝐧 𝐢𝐧 𝐇𝐚𝐧𝐝:
👉 Enter BigQuery Repositories! 🎉 Think of it as giving your BigQuery SQL a super organized closet with a built-in version control system, thanks to its shiny new integration with Git! Yes, the same Git that developers have been raving about for ages is now here to bring order to your data chaos.
👉 This means you can now connect your BigQuery projects directly to your favorite Git provider (like GitHub, GitLab, or Bitbucket). You can track changes, collaborate seamlessly with branches and pull requests, and even roll back to previous versions if your latest "brilliant" transformation turns out to be… less brilliant. 😉

𝐊𝐞𝐲 𝐓𝐚𝐤𝐞𝐚𝐰𝐚𝐲𝐬 𝐟𝐨𝐫 𝐈𝐧𝐝𝐢𝐯𝐢𝐝𝐮𝐚𝐥 𝐔𝐬𝐞𝐫𝐬:
👉 Say Goodbye to "Oops!" Moments: Accidentally deleted a crucial line of code? No sweat! Git's got your back. You can easily revert to previous versions & avoid that heart-stopping "Ctrl+Z" panic. ❤️‍🩹
👉 Collaborate Like a Pro: Working with a team? Now you can work on different parts of your data transformations simultaneously without overwriting each other's work. It's like having separate lanes on the data highway! 🛣️
👉 Reproducibility Made Easy: Need to run the same analysis again with slight variations? Git makes it a breeze to manage different versions & ensure your results are consistent. 🧑‍🍳

𝐊𝐞𝐲 𝐓𝐚𝐤𝐞𝐚𝐰𝐚𝐲𝐬 𝐟𝐨𝐫 𝐎𝐫𝐠𝐚𝐧𝐢𝐳𝐚𝐭𝐢𝐨𝐧𝐬:
👉 Improved Data Governance & Auditability: Track every change made to your data transformations, providing a clear audit trail and ensuring better data governance. It's like having a digital paper trail for all your data magic. 📜
👉 Streamlined Development & Deployment: Implement CI/CD (Continuous Integration/Continuous Deployment) pipelines for your data transformations, making the development and deployment process smoother & more reliable. ✈️

BigQuery's integration with Git is a significant step towards bringing the best practices of software development to the world of data analytics. It's about making our lives as data professionals easier, more organized & dare I say, even a little bit more fun! 🎉

Follow Omkar Sawant for more.

#BigQuery #GoogleCloud #DataAnalytics #Git #VersionControl #DataEngineering #CloudComputing #Collaboration #Productivity #Innovation

Riya Khandelwal

70,050 followers 4mo

SCD Type 2 looks easy… until you put it into production. Tracking history is not just about adding `start_date` and `end_date`. It’s about schema evolution, late-arriving data, deduplication, auditability, and idempotency, all at scale. This architecture shows how to solve SCD Type 2 properly using Databricks and the Medallion pattern.

🔹 Ingestion → Bronze (Raw, Immutable)
Claims data arrives as JSON from ADLS and is ingested with Databricks Auto Loader. No business logic here, just:
* Schema evolution
* Ingestion metadata
* Replayability
If you don’t get Bronze right, everything downstream is fragile.

🔹 Bronze → Silver (Clean & Trusted)
This is where engineering discipline matters:
* Flatten & transform
* Deduplicate records
* Handle late arrivals with watermarking
* Enforce data quality (valid amounts, non-null claim IDs)
Silver is not “almost Gold”; it’s trusted, reusable data.

🔹 Silver → Gold (SCD Type 2 History)
The real value layer:
* Full historical tracking
* Effective start & end dates
* Change detection
* Current vs historical views
Using Delta Lake + Delta Live Tables, SCD-2 becomes:
✔ Declarative
✔ Idempotent
✔ Auditable
✔ Production-safe
No brittle MERGE spaghetti. No manual backfills.

🔹 Why this works in real projects
* Schema changes don’t break pipelines
* Late data doesn’t corrupt history
* Every change is traceable
* Pipelines are built for scale, not demos

This is how modern data platforms should handle history: not with clever SQL, but with robust architecture.

📌 𝗙𝗼𝗿 𝗠𝗲𝗻𝘁𝗼𝗿𝘀𝗵𝗶𝗽/ 𝟭:𝟭 𝗖𝗮𝗹𝗹 𝗯𝗼𝗼𝗸 𝗵𝗲𝗿𝗲 -- https://lnkd.in/dqr-vGTj
📌 𝗙𝗼𝗿 𝗚𝘂𝗶𝗱𝗮𝗻𝗰𝗲 - https://lnkd.in/dqr-vGTj
📌 𝗙𝗼𝗹𝗹𝗼𝘄 𝗺𝗲 𝗼𝗻 𝗠𝗲𝗱𝗶𝘂𝗺 𝗳𝗼𝗿 𝗧𝗲𝗰𝗵𝗻𝗶𝗰𝗮𝗹 𝗕𝗹𝗼𝗴𝘀 - https://lnkd.in/dHhPyud2
Looking for an ATS-friendly resume? 𝗗𝗼𝘄𝗻𝗹𝗼𝗮𝗱 𝗥𝗲𝗰𝗿𝘂𝗶𝘁𝗲𝗿-𝗔𝗽𝗽𝗿𝗼𝘃𝗲𝗱 𝗥𝗲𝘀𝘂𝗺𝗲 𝗧𝗲𝗺𝗽𝗹𝗮𝘁𝗲-https://lnkd.in/gunTVTmz
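To make the Gold-layer ideas concrete (effective dates, change detection, idempotent replays), here is a minimal pure-Python sketch. It is not the Delta Live Tables implementation the post describes; `ClaimVersion`, `apply_scd2`, and the `amount`/`status` fields are hypothetical names chosen for illustration, and late-arriving data is deliberately left out.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ClaimVersion:
    # Hypothetical record shape; a real claims table would carry more columns.
    claim_id: str
    amount: float
    status: str
    start_date: date
    end_date: Optional[date] = None  # None marks the current version


def apply_scd2(history, incoming, as_of):
    """Return a new history list with `incoming` applied as of `as_of`.

    `incoming` maps claim_id -> (amount, status). Re-applying a batch that
    matches the current state adds no rows, which is the idempotency property.
    """
    current = {v.claim_id: v for v in history if v.end_date is None}
    result = list(history)
    for claim_id, (amount, status) in incoming.items():
        live = current.get(claim_id)
        if live is not None and (live.amount, live.status) == (amount, status):
            continue  # change detection: nothing changed, so do nothing
        if live is not None:
            # Close the live version instead of overwriting it.
            result[result.index(live)] = replace(live, end_date=as_of)
        result.append(ClaimVersion(claim_id, amount, status, start_date=as_of))
    return result


history = apply_scd2([], {"C1": (100.0, "open")}, date(2024, 1, 1))
history = apply_scd2(history, {"C1": (120.0, "open")}, date(2024, 2, 1))
replayed = apply_scd2(history, {"C1": (120.0, "open")}, date(2024, 2, 1))
```

After the second batch, `history` holds a closed row for the old amount and an open row for the new one, and `replayed` equals `history` because the repeated batch changes nothing. The "current view" is simply the rows whose `end_date` is `None`.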

7 Comments

John Capobianco

18,452 followers 4mo

We version and source control code with Git. But AI conversations are still ephemeral. Once you close the terminal, the reasoning path is gone. What we need in this AI age is *context* control: where version and source control track files and code, we can now control the *context*.

I’ve been working on something I think is overdue: Git for Artificial Intelligence Tracking (GAIT), a version control system for AI conversations.

GAIT treats conversations as first-class artifacts:
• Every turn is a content-addressed object
• Every conversation state is a commit
• Branches represent alternate reasoning paths
• Merges represent choosing the best answer
• Memory is explicit and intentional
• Token usage is visible and accountable

In the demo video, I show:
• Chatting with a local LLM
• Automatic commits per turn
• Deterministic resume of context
• Branching different explanations
• Merging the best reasoning path
• Pushing conversations to a remote
• Cloning and resuming conversations anywhere

I also built gaithub, a remote backend that lets you push, fetch, and clone AI conversations, just like Git.

This isn’t prompt logging. This isn’t chat history export. This is version control for thinking. Context control.

A couple of really neat things, in no particular order:
* Every Q&A is a turn and is committed automatically
* Let's say you liked your first prompt and response, but your second prompt / response is 'bad' and 'poisons' your context
* Just revert, like Git, to the previous commit and resume
* Branching! You could make branches after your first turn to test various prompts, even swap out models, and choose to merge your branch back into main (like Git, but with context)
* Resume conversations inside a gait-tracked folder
* gait clone context to other machines; other people can clone *your* context and work with their own prompts or models
* fork / pull / push / PR: all on the roadmap

🎥 Demo video here: https://lnkd.in/eKtJZ4CT

I’d love feedback from:
• AI engineers
• Local framework users (Ollama, Microsoft Foundry Local, LM Studio)
• DevRel & platform teams
• Anyone building serious agent systems
• Anyone with knowledge of Git or version/source control systems

Git changed how we build software. GAIT is about changing how we build thinking.

GitHub repositories: https://lnkd.in/eRwEzMph https://lnkd.in/eRmbVGD5

#AI #DeveloperTools #LLMOps #OpenSource #DevRel #Agents #versioncontrol #sourcecontrol #github #git #ollama #microsoft #chatgpt #openai #gemini #claude #LLM #futureology
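The ideas of content-addressed turns, one commit per turn, branching, and revert can be sketched in a few lines of Python. This is an illustration of the general technique only, not GAIT's actual code or on-disk format; `ConversationStore` and all of its methods are hypothetical names.

```python
import hashlib
import json


class ConversationStore:
    """Toy content-addressed store: turns and commits are keyed by SHA-256."""

    def __init__(self):
        self.objects = {}               # object hash -> object dict
        self.branches = {"main": None}  # branch name -> head commit hash

    def _put(self, obj):
        # The key is derived from the content, so identical objects share a key.
        data = json.dumps(obj, sort_keys=True).encode("utf-8")
        key = hashlib.sha256(data).hexdigest()
        self.objects[key] = obj
        return key

    def commit_turn(self, branch, prompt, response):
        """Store one Q&A turn and advance the branch head to a new commit."""
        turn = self._put({"type": "turn", "prompt": prompt, "response": response})
        commit = self._put(
            {"type": "commit", "turn": turn, "parent": self.branches[branch]}
        )
        self.branches[branch] = commit
        return commit

    def branch(self, name, from_branch="main"):
        """Start an alternate reasoning path at the head of from_branch."""
        self.branches[name] = self.branches[from_branch]

    def revert(self, branch):
        """Move the branch head back one commit, dropping the latest turn."""
        head = self.branches[branch]
        if head is not None:
            self.branches[branch] = self.objects[head]["parent"]

    def context(self, branch):
        """Rebuild the ordered (prompt, response) list by walking parent links."""
        turns, cursor = [], self.branches[branch]
        while cursor is not None:
            commit = self.objects[cursor]
            turn = self.objects[commit["turn"]]
            turns.append((turn["prompt"], turn["response"]))
            cursor = commit["parent"]
        return list(reversed(turns))


store = ConversationStore()
store.commit_turn("main", "Explain BGP", "BGP is a path-vector protocol.")
store.branch("alt")
store.commit_turn("main", "Go deeper", "A reply that derails the context.")
store.revert("main")  # drop the bad second turn from main's context
store.commit_turn("alt", "Explain it to a beginner", "Think of postal routes.")
```

Here `store.context("main")` contains only the first turn after the revert, while `store.context("alt")` contains the shared first turn plus its own follow-up. Reverting moves a pointer; the dropped turn stays in `objects`, which is what makes the history recoverable.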

Introducing GAIT: Git for Artificial Intelligence Tracking

https://www.youtube.com/

15 Comments

Christian Larsen

4,409 followers 3mo

Ever wondered how to track data changes in IBM i without writing tedious audit triggers? In this video, I explain SQL History Tables — a powerful built-in feature that automatically tracks INSERT, UPDATE, and DELETE operations on your tables. 👉 This video is perfect if you're tired of maintaining manual audit tables and want a cleaner, automated solution. 👉 And if you've never used system-period temporal tables before, this step-by-step guide will show you exactly how to implement them. I cover: ✅ Creating history tables with LIKE. ✅ Linking main and history tables with ALTER TABLE. ✅ Automatic versioning without triggers or custom code. ✅ Using implicitly hidden columns for audit tracking. ✅ Recovering accidentally deleted records from history. ✅ Unlinking and managing history tables when needed. 📺 Watch it here 👉 https://lnkd.in/eJb5knaa 🔥 Don't forget to like, share, and comment if you found it useful (or even if you didn't!) #IBMi #IBM #RPG #RPGLE #SQL #AS400 #IBMChampion

Coding in RPG (IBM i/AS400). Using SQL history tables.

https://www.youtube.com/

3 Comments

Yuriy Mosiyenko

Industrial Electrical Control Systems: Design to Commissioning | Accelerating New Production Line Startups | Functional Design & Virtual Commissioning | Siemens PLC/HMI | Eplan

5,625 followers 8mo

Commissioning new automation projects always brings this scenario: You’ve got a stable build running on the machine. Everything is in sync, the project is ticking. Then someone says, “Can you add this?” or “Change how that function works.” It feels like a quick adjustment. You open the program, edit, test… then another tweak, another fix. Soon you’ve touched five different places. And maybe the customer reverses the request. Now you’d really like to just rewind to the first stable version, but you can’t. The “undo” trail is gone, and the hours you spend retracing steps could have gone into real progress.

This is why version control matters. A safe checkpoint means:
- You can test boldly, knowing rollback is one click away.
- You protect commissioning time from frustrating rebuilds.
- You reduce the risk of introducing subtle bugs at the last minute.

Now, big companies may invest in specialized tools for this. Some teams try to force Git or SVN into PLC workflows. Personally, I do use Git too, and I want to try SVN, but that is a story for another post. This post is about a simple solution that may already be in your hands and doesn’t require any new software installation or investment. The good news: if you already run Microsoft 365, you already own it. It's SharePoint.

Why SharePoint fits commissioning workflows
SharePoint is more than a place to store documents. Its built-in versioning turns every file overwrite into a checkpoint.

Practical workflow (TIA Portal example)
- Archive first locally: in TIA Portal, create an archive of your project in a normal local folder. (Never archive directly into SharePoint, because TIA deletes the old file before writing the new one. That breaks version history.)
- Move into the SharePoint synced folder: just move the archive file there manually with Windows File Explorer. SharePoint saves automatically; every time you overwrite, the previous archive is retained.
- Add version notes: in the SharePoint Online folder, create a new column, for example “Version comments”.
- Revert anytime: click “Version history,” choose the checkpoint you trust, restore, and sync back down. Open again in TIA Portal → you’re back at stable.

See screenshots for clarity. If there is interest in a more detailed description, leave me a comment, and I may put it into an article another day.

If you’re planning your next automation project, it’s worth making version control part of your commissioning plan up front. By the way, if you need some help, push that button on otomakeit.com to start a discussion.

#otomakeit #efficiency #industrialautomation #controlsystems #controlpanel #Siemens #TIA #virtualcommissioning


14 Comments

Arturo Ferreira

Exhausted dad of three | Lucky husband to one | Everything else is AI

5,777 followers 6mo

Spent 4 hours rebuilding a prompt that worked perfectly last month. Couldn't remember the exact wording. Lost in 200 messages of chat history. Here are 4 reasons smart teams version their prompts the way developers version code:

Reason 1: Your Best Prompts Disappear
That perfect prompt from three weeks ago? Gone:
↳ Buried in endless chat threads
↳ Slightly different each time you use it
↳ No way to find the exact version that worked
↳ Team members recreating it from scratch
Developers don't rewrite functions from memory. Why do we treat prompts like disposable notes?

Reason 2: Small Changes Break Everything
Changed one word in your prompt. Output quality dropped 40%. Which word was it?
↳ No record of what you modified
↳ Can't compare old vs new versions
↳ Impossible to isolate what went wrong
Git shows you exactly what changed and when. Prompt versioning does the same for AI workflows.

Reason 3: Teams Can't Scale Without Documentation
Your sales team has 8 different versions of the "prospect research" prompt:
↳ Each rep wrote their own variation
↳ Quality wildly inconsistent across the team
↳ No way to identify which version performs best
↳ New hires start from zero
One centralized, versioned prompt library means:
↳ Everyone uses tested, proven prompts
↳ Updates roll out to the entire team instantly
↳ Performance metrics track back to specific versions

Reason 4: You Can't Optimize What You Don't Track
Prompt A gets 78% accuracy. Prompt B gets 91% accuracy. What's the difference?
↳ Without versions, you're guessing
↳ Can't run A/B tests on prompts
↳ Improvements get lost or forgotten
Track versions, test systematically, keep what works.

Developers learned this 20 years ago with code. AI teams are learning it now with prompts. Your prompts are intellectual property. Treat them like assets, not throwaway text.

How do you track your best prompts right now?

P.S. Want to learn more about AI?
1. Scroll to the top
2. Click "Visit my website"
3. Sign up for our free newsletter
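As a rough illustration of Reasons 2 and 4, a prompt library only needs to keep every saved text and be able to diff two versions. The sketch below uses Python's standard `difflib` and `hashlib`; `PromptLibrary`, its method names, and the sample prompt text are hypothetical, and a real team would more likely keep prompts in Git or a database than in memory.

```python
import difflib
import hashlib


class PromptLibrary:
    """Toy in-memory prompt registry; every save of changed text adds a version."""

    def __init__(self):
        self._versions = {}  # prompt name -> list of prompt texts, oldest first

    def save(self, name, text):
        """Store text as a new version and return its 1-based version number."""
        versions = self._versions.setdefault(name, [])
        if versions and versions[-1] == text:
            return len(versions)  # unchanged text does not create a version
        versions.append(text)
        return len(versions)

    def get(self, name, version):
        return self._versions[name][version - 1]

    def fingerprint(self, name, version):
        """Short content hash for tagging outputs with the exact prompt used."""
        text = self.get(name, version)
        return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]

    def diff(self, name, old, new):
        """Unified diff between two versions, one prompt line per diff line."""
        return list(
            difflib.unified_diff(
                self.get(name, old).splitlines(),
                self.get(name, new).splitlines(),
                fromfile=f"{name}@v{old}",
                tofile=f"{name}@v{new}",
                lineterm="",
            )
        )


library = PromptLibrary()
v1 = library.save("prospect-research", "Summarize the company.\nList three risks.")
v2 = library.save("prospect-research", "Summarize the company.\nList five risks.")
changes = library.diff("prospect-research", v1, v2)
```

`changes` pinpoints the single edited line (`-List three risks.` / `+List five risks.`), and logging `fingerprint(...)` next to each model output is one way to tie a performance number back to the exact prompt version that produced it.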

96 Comments

Oleg Shilovitsky

CEO @ OpenBOM | Innovator, Leader, Industry Pioneer | Transforming CAD, PLM, Engineering & Manufacturing | Advisor @ BeyondPLM

21,766 followers 1y

🔄 Why Revision Control and Change Management Matter More Than Ever

🔍 In product development, change is constant: designs evolve, suppliers shift, requirements grow. But here's the kicker: the old ways of managing these changes (hello, spreadsheets and shared folders) simply don’t cut it anymore.

🚨 Why Revision Control Is Critical
Revision control isn’t just about tracking edits; it’s about building a reliable system to manage product evolution. It provides full traceability, ensures consistency, and supports compliance across the lifecycle of your product. Without it, you're flying blind.

🤝 The Modern Twist: Collaborative, Multi-Team Development
Modern product development involves mechanical, electrical, and software teams all working in parallel. Changes in one domain impact the others. A robust revision system must not only track changes but also synchronize them across disciplines, ensuring everyone stays aligned.

💡 How OpenBOM Makes It Work
At OpenBOM, we reimagined revision control for today’s collaborative, connected teams. Our Collaborative Workspace lets multiple contributors work on structured product data (Items, BOMs, CAD files) in real time, with full version and revision tracking.
✅ Instant visibility into updates
✅ Baseline snapshots for audit and compliance
✅ Structured ECO workflows to manage formal change

With OpenBOM, you're not just managing revisions. You're managing a virtual collaborative data space where users can make changes simultaneously, and you’re building a connected digital thread across design, engineering, and manufacturing.

📘 Read the full article here [link in the comments]

🔁 Interested in discussing how to help mechanical, electronics, and software teams work together and control changes? Let's talk.

#PLM #PDM #RevisionControl #DigitalThread #OpenBOM #Collaboration #Engineering #Manufacturing #ProductDevelopment #ChangeManagement

2 Comments

Pierre Brunelle

Multimodal Data Infra: github.com/pixeltable/pixeltable

7,704 followers 9mo

Pixeltable automatically versions every change to your data and schema - no configuration required. This isn't just backup functionality; it's a core architectural feature that enables:

1. Automatic Version Tracking
- Every table operation (inserts, updates, deletes, schema changes) creates a new version
- Monotonically increasing version numbers starting from 0
- Separate tracking for data versions vs schema versions
- Complete metadata preservation for each version

2. Time Travel Functionality
- Access any historical version of your table
    current_table = pxt.get_table('my_table')
    historical_table = pxt.get_table('my_table:722')  # Version 722
    older_version = pxt.get_table('my_table:15')  # Version 15

3. Instant Revert Operations
- Made a mistake? Undo it instantly
    table.revert()  # Undoes the last operation
    table.revert()  # Keep going back further

4. Immutable Snapshots
- Create point-in-time snapshots for experiments
    snapshot = pxt.create_snapshot('experiment_v1', my_table)
- Snapshots are immutable - perfect for reproducible research
- Base table can continue evolving independently

5. Complete History Reporting
- See exactly what happened and when
    history = table.history()
- It shows: version, timestamp, user, operation type, row counts, schema changes...

6. View & Snapshot Lineage
- Views and snapshots reference specific base table versions
- Creates a full lineage DAG (Directed Acyclic Graph)
- Prevents accidental deletion of referenced versions

7. Configurable Retention
- Control how many versions to keep
    pxt.create_table('my_table', schema, num_retained_versions=50)
    pxt.create_view('my_view', base_table, num_retained_versions=20)

----- Overall it's zero-overhead -----
- Versioning is built into the storage layer
- No performance penalty for version tracking
- Every transformation, model inference, and data change is versioned
- Full lineage tracking from raw data to final results
- Perfect for experimentation tracking and audit trails
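For readers who want to see the mechanics behind "every operation creates a new version", here is a toy model in plain Python. It does not use Pixeltable and says nothing about how Pixeltable stores versions; `VersionedTable` and its methods are hypothetical. It also copies all rows on every operation, so it is nowhere near zero-overhead and only illustrates the semantics of version numbers, time travel, and revert.

```python
import copy


class VersionedTable:
    """Toy table that keeps a full snapshot of its rows after every operation."""

    def __init__(self):
        self._versions = [[]]  # version 0 is the empty table

    @property
    def version(self):
        # Versions increase by one per operation, starting from 0.
        return len(self._versions) - 1

    def rows(self, version=None):
        """Return a copy of the rows at `version` (default: the latest)."""
        index = self.version if version is None else version
        return copy.deepcopy(self._versions[index])

    def insert(self, row):
        new_rows = self.rows()
        new_rows.append(dict(row))
        self._versions.append(new_rows)

    def delete_where(self, predicate):
        self._versions.append([r for r in self.rows() if not predicate(r)])

    def revert(self):
        """Undo the most recent operation; version 0 cannot be reverted."""
        if self.version == 0:
            raise ValueError("nothing to revert")
        self._versions.pop()


table = VersionedTable()
table.insert({"id": 1, "label": "cat"})      # version 1
table.insert({"id": 2, "label": "dog"})      # version 2
table.delete_where(lambda r: r["id"] == 1)   # version 3
before_delete = table.rows(version=2)        # time travel to version 2
table.revert()                               # undo the delete
```

After the revert, the table is back at version 2 and `table.rows()` matches `before_delete`. A production system would share unchanged data between versions rather than copy it, which is the part this sketch leaves out.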
