Tools For Project Scheduling

Explore top LinkedIn content from expert professionals.

  • View profile for Puneet Patwari

    Principal Software Engineer @Atlassian| Ex-Sr. Engineer @Microsoft || Sharing insights on SW Engineering, Career Growth & Interview Preparation

    71,296 followers

    You're sitting in an L5-level system design interview at Google, and you've just been told to design a distributed job scheduler. You’ve done job schedulers before. Great. But it only takes one extra constraint to turn something “simple” into a headache:

    → Suppose they add DAG-based execution, and now you’re managing dependency ordering
    → Suppose they add millions of jobs/day, and suddenly your scheduler table must survive hell
    → Suppose they add multi-level executors (cheap vs expensive hardware), and now you’re in OS-level scheduling territory

    Before you know it, your “simple scheduler” becomes a mini Airflow + Cron + Kafka hybrid. Here’s my personal checklist of 15 things you must get right when designing a distributed job scheduler:

    1. Store binaries in object storage
       Never ship code through your backend. Users upload binaries/scripts → you store them in S3/GCS → executors download directly.
    2. Separate Cron jobs and DAG jobs
       Cron needs predictable time-based triggering. DAGs need dependency resolution + epoch tracking. Do NOT mix both in one table.
    3. Topologically sort DAGs on upload
       Users will dump random graphs. You must determine roots, order, and execution sequence.
    4. Pre-schedule only the next Cron run
       Not all future runs. Only the *upcoming* job instance goes into the scheduler table.
    5. Each job must have a “run_at” timestamp
       Schedulers poll: `SELECT * FROM tasks WHERE run_at <= NOW() AND status = 'pending'`
    6. Update run_at as soon as execution starts
       Add +5 or +10 min. This prevents retry storms and ensures clean scheduling timeouts.
    7. Executors pull tasks instead of having tasks pushed to them
       Pulling avoids overload, simplifies horizontal scaling, and prevents blind pushes.
    8. Use an in-memory message broker for load balancing
       Kafka = bad for job schedulers (partition lock-in). ActiveMQ/RabbitMQ = executors pick tasks only when idle.
    9. Use multi-level priority queues
       Think OS scheduling: Level 1 → cheap nodes, Level 2 → standard, Level 3 → high-power nodes. Long-running tasks get escalated.
    10. Use distributed locks for “run once” semantics
        ZooKeeper lock per job ID → prevents simultaneous execution on multiple executors.
    11. Accept that some jobs may run twice
        Make jobs idempotent. Use versioned writes. Retry logic will inevitably double-fire something.
    12. Maintain a status table with final outcomes
        Users should see: pending, running, success, failed, error logs.
    13. Use read replicas for user-facing status
        Never let users hit the primary scheduler DB.
    14. Shard scheduler table by job_id + time range
        Millions of rows. High churn. Without sharding, your entire system becomes a single-point bottleneck.
    15. Use change-data-capture (CDC) instead of 2-phase commits
        When DAG nodes complete → update DAG table → emit CDC event → enqueue next node. No locking hell. No cross-table multi-row transactions.
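
    Items 5 and 6 of the checklist (poll on `run_at`, then push `run_at` forward when execution starts) can be sketched roughly as below. This is only an illustration, not the author's implementation: the `tasks` schema, the `claim_due_tasks` and `mark_done` helpers, the 300-second lease, and the use of SQLite with integer timestamps are all assumptions made to keep the example self-contained.

```python
import sqlite3

LEASE_SECONDS = 300  # the "+5 min" from item 6; a tunable assumption


def claim_due_tasks(conn: sqlite3.Connection, now: int, limit: int = 10) -> list[int]:
    """Claim due tasks by pushing run_at forward; return the claimed task ids."""
    rows = conn.execute(
        "SELECT id, run_at FROM tasks "
        "WHERE run_at <= ? AND status = 'pending' ORDER BY run_at, id LIMIT ?",
        (now, limit),
    ).fetchall()
    claimed = []
    for task_id, run_at in rows:
        # Compare-and-set on run_at: if another poller already moved it,
        # this UPDATE matches zero rows and the task is skipped.
        cur = conn.execute(
            "UPDATE tasks SET run_at = ? "
            "WHERE id = ? AND run_at = ? AND status = 'pending'",
            (now + LEASE_SECONDS, task_id, run_at),
        )
        if cur.rowcount == 1:
            claimed.append(task_id)
    conn.commit()
    return claimed


def mark_done(conn: sqlite3.Connection, task_id: int) -> None:
    """Record the final outcome so the task is never picked up again."""
    conn.execute("UPDATE tasks SET status = 'success' WHERE id = ?", (task_id,))
    conn.commit()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, run_at INTEGER, status TEXT)")
    conn.executemany(
        "INSERT INTO tasks VALUES (?, ?, 'pending')", [(1, 100), (2, 100), (3, 900)]
    )
    first = claim_due_tasks(conn, now=100)   # tasks 1 and 2 are due
    second = claim_due_tasks(conn, now=101)  # both are leased, nothing to claim
    mark_done(conn, 1)                       # task 1 finishes; task 2's executor "crashes"
    third = claim_due_tasks(conn, now=400)   # task 2's lease has expired, so it is retried
    return first, second, third
```

    In this sketch a crashed run is retried only once its lease expires, which is one way to read the "prevents retry storms" claim. It also shows why item 11 matters: a slow but healthy executor can overrun its lease and the task will be handed out a second time.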

  • View profile for Aakash Gupta

    Helping you succeed in your career + land your next job

    312,433 followers

    Every weekday at 7:30 AM, I get a one-paragraph brief for every meeting on my calendar. Last email threads with each participant, open asks, unresolved questions. Claude wrote it while I was asleep.

    Anthropic shipped three automation tools in four weeks. Two serve you individually. One serves your whole team. The routing decision is simple.

    Work needs your local files? Cowork Scheduled Tasks. Runs on your machine, reads ~/Documents.

    Needs to fire while your laptop is closed? Claude Routines. Cloud infrastructure. Competitor checks at 7 AM, sentiment scans on Monday morning, pre-meeting briefs before you wake up. Pro plan gets 5 runs/day. Max gets 15.

    Needs to serve more than just you? Managed Agents. Every PM queries the same agent, each with their own session and audit trail. Asana, Notion, Rakuten, and Sentry are already running these in production. Rakuten went from quarterly releases to biweekly.

    The reasoning step is what separates this from Zapier. A Zapier zap chains deterministic actions. A Routine reads a competitor pricing page, decides whether something meaningful changed, and writes a summary in your voice. Different category of work.

    I set up a competitor pricing monitor in 20 minutes. It visits three competitor pages every morning, compares against yesterday's Notion log, and posts only what changed to Slack. I know about pricing shifts before my sales team hears them on calls.

    A weekly sentiment scanner does the same thing across Reddit, G2, and Product Hunt. Four weeks of consistent themes tell you what users actually want, not what's loudest internally.

    I built 7 of these workflows with full prompts, connector setup, failure modes, an engineer handoff brief, and a security doc: https://lnkd.in/gyb4FkHa

    The PM who walks into Monday planning with automated intelligence will out-prioritize the one going off memory and escalations. That gap compounds every week.

  • View profile for Spencer Dorn

    Vice Chair & Professor of Medicine, UNC | Balanced healthcare perspectives

    19,867 followers

    Appointment scheduling is an unglamorous, under-discussed yet prime area for harnessing AI. Think of it as a matching problem. How do we schedule patients with the right clinicians, at the right time, at the right place, and with the right concurrent services for their specific needs?

    However, assigning patients to the right pathway can be painstaking. Sometimes, non-clinical staff (or sometimes nurses) sift through long records to find the information needed to make scheduling decisions. Other times, patients are simply scheduled haphazardly. This is especially challenging in the UK, where patients are routinely placed on very long waiting lists, and some deteriorate while waiting for their appointment.

    Here, The Times explains how C2-Ai’s system reviews waitlists to identify patients who should be prioritized for sooner care and/or who need coaching before surgery. The company reports impressive results on its website (e.g., 99% clinician agreement, 8% reduction in emergency admissions, 125 saved bed-days per 1,000 patients, and five minutes saved per patient triage).

    This is a very pragmatic, valuable AI use case. I see clear opportunities to apply AI both earlier and later in referral processing and scheduling workflows. First, applying a blend of AI to process referrals and guide scheduling decisions (avoiding wait lists when possible + necessary). Later, applying Gen AI to create referral/patient summaries so clinicians can quickly learn about who they are about to see.

    Though this may not be as exciting as AI for diagnosis or as widely discussed as AI for tasks like note writing, it’s quite practical, attainable, and impactful.

  • View profile for Mike Rizzo

    Brand partnership Certifying GTM Ops Professionals. Community-led Founder & CEO @ MarketingOps.com and MO Pros® - where 4,000+ Marketing Operations, GTM Ops, and Revenue Ops professionals architect GTM products.

    19,902 followers

    If you talk to enough GTM operators and the RevOps leaders supporting them, you’ll hear the same frustration: “We fix everything upstream, and scheduling still finds a way to break.”

    A rep grabs the wrong calendar. A handoff gets messy. Enrichment lags. Ownership rules get ignored. And a qualified prospect sits in limbo or disappears entirely. Everyone feels the pain, yet nobody truly owns the fix.

    We solved routing. We solved scoring. We solved attribution. But scheduling (the moment with revenue on the line) stayed detached from the system designed to govern it. It looks tiny from the outside, but scheduling carries the load of the whole GTM engine. It’s where logic, data, timing, and fairness collide. Most tools don’t understand any of that. They treat booking a meeting as a click, not a system event.

    That gap is why I’ve been paying attention to what Default is launching today. Their new Chrome extension brings orchestration logic directly into Gmail, Salesforce, and the places reps live every day. Before a rep even sees the calendar, Default is already evaluating:

    - Multi-object routing
    - Enrichment waterfalls
    - Account hierarchies
    - Qualification rules
    - Fairness and load balancing
    - Booker attribution
    - SLAs and follow-up workflows

    Only then does it show time slots. The extension becomes a distributed front-end for RevOps: your logic follows the rep, not the other way around.

    ➡ Handoffs stay intact.
    ➡ Ownership stays accurate.
    ➡ Meeting workflows fire cleanly.
    ➡ Debugging becomes observable rather than guesswork.

    The meeting reflects the system, not rep improvisation. For operators, this moves us closer to something we’ve been chasing for years: a GTM engine that behaves the way it was actually designed. Who else is excited?

    #RevOps #MarketingOps #Scheduling #LeadRouting #DefaultPartner #GTM

  • View profile for Hrittik Roy

    Platform Advocate at vCluster | CNCF Ambassador | Google Venkat Scholar | CKA, KCNA, PCA | Gold Microsoft LSA | GitHub Campus Expert 🚩| 4X Azure | LIFT Scholar '21|

    12,356 followers

    Scheduling in Kubernetes happens in various ways. Depending on the workload, you might need different algorithms like 𝗚𝗮𝗻𝗴 𝗦𝗰𝗵𝗲𝗱𝘂𝗹𝗶𝗻𝗴. Volcano, a CNCF project, supports this and can optimize complex workflows such as AI training, inference pipelines, and distributed data processing.

    🚀 𝗪𝗵𝗮𝘁 𝗶𝘀 𝗚𝗮𝗻𝗴 𝗦𝗰𝗵𝗲𝗱𝘂𝗹𝗶𝗻𝗴?
    Gang scheduling ensures all pods in a group ("gang") start simultaneously or none do. This prevents partial execution, which is critical for interdependent tasks like distributed training or multi-stage AI pipelines. Without it, a single delayed pod could stall an entire workflow, wasting resources.

    𝗘𝘅𝗮𝗺𝗽𝗹𝗲: In distributed AI training, if three worker pods are needed, Volcano’s gang scheduler waits until all 3 are available. If even one fails to schedule, the scheduler releases reserved resources to avoid cluster deadlocks.

    ⚡ 𝗪𝗵𝘆 𝗩𝗼𝗹𝗰𝗮𝗻𝗼?
    Volcano extends Kubernetes’ default scheduler to handle batch workloads and multi-pod dependencies. It’s ideal for:
    → AI/ML workflows (e.g., TensorFlow/PyTorch jobs).
    → Big Data processing (Spark, Flink).
    → High-performance computing (HPC).

    Key features:
    ✅ PodGroup orchestration: Treats multiple pods as a single schedulable unit.
    ✅ Fair-share resource allocation: Balances cluster resources across teams.
    ✅ Preemption/Reclaim: Prioritizes critical workloads without manual intervention.

    🌟 𝗥𝗲𝗮𝗹-𝗪𝗼𝗿𝗹𝗱 𝗨𝘀𝗲 𝗖𝗮𝘀𝗲
    Imagine training a large language model (LLM) across 3 GPUs. With gang scheduling:
    → Volcano groups all worker pods into a PodGroup.
    → The scheduler reserves resources only when all 3 GPUs are available.
    → If a node fails, Volcano retries or releases resources instantly, avoiding idle clusters.

    This eliminates "resource hoarding" and ensures cost-efficient scaling for AI teams.

    #Kubernetes #mlops
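
    The all-or-nothing rule described above can be illustrated with a toy model. This is not Volcano's algorithm or API; the `gang_schedule` function, the first-fit placement, and the node names are assumptions chosen only to show that a gang either reserves resources for every pod or reserves nothing.

```python
def gang_schedule(free_gpus: dict[str, int], gang_size: int, gpus_per_pod: int = 1):
    """Place every pod of a gang or none of them.

    free_gpus maps node name -> free GPU count and is updated in place only
    when the whole gang fits. Returns {pod_index: node} or None.
    """
    remaining = dict(free_gpus)  # work on a copy so a failed attempt reserves nothing
    placement = {}
    for pod in range(gang_size):
        # First-fit over nodes in name order (a simplification for the sketch).
        node = next(
            (name for name, free in sorted(remaining.items()) if free >= gpus_per_pod),
            None,
        )
        if node is None:
            return None  # one pod cannot be placed, so no pod is placed
        remaining[node] -= gpus_per_pod
        placement[pod] = node
    free_gpus.update(remaining)  # commit the reservation once every pod fits
    return placement
```

    With `{"node-a": 2, "node-b": 1}` a three-pod gang is placed and both nodes end up fully reserved; with `{"node-a": 1, "node-b": 1}` the call returns `None` and the free counts are left untouched, which is the behaviour that avoids two half-placed jobs blocking each other.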

  • View profile for Aryan Irani

    I write and create on the internet.

    6,248 followers

    I spend a huge part of my week just managing my calendar: finding free slots, rescheduling meetings, dealing with recurring events, and juggling multiple time zones. It’s tedious and eats into real work.

    That’s why I decided to build my own solution: a Google Calendar AI agent powered by Google’s Agent Development Kit. This agent can:
    👉 Understand plain English commands like “Schedule a 1-hour call with Alex next Tuesday morning”.
    👉 Suggest free time slots based on my existing calendar.
    👉 Handle recurring events, cancellations, and attendees automatically.
    👉 Work across time zones without any manual conversion.

    While building this, I learned something crucial: AI isn’t just about generating text; it can actually perform actions that solve real problems. Designing this agent taught me how to bridge natural language understanding with real-world API actions.

    I wrote a detailed step-by-step blog, including code snippets and logic, so anyone can replicate this setup or build their own AI productivity assistant: https://lnkd.in/dsDhtcMr

    #AIAgents #AgentDevelopmentKit Google Cloud #GoogleAI #GoogleCalendar #CalendarManagement #AgenticAI
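
    The "suggest free time slots" capability boils down to finding gaps between busy intervals. The sketch below is not taken from the linked blog and does not use the Agent Development Kit or the Google Calendar API; `free_slots` and its arguments are hypothetical, and it assumes the busy intervals have already been fetched and converted to timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone


def free_slots(busy, day_start, day_end, duration):
    """Return (start, end) gaps of at least `duration` inside [day_start, day_end].

    busy is an iterable of (start, end) datetime pairs; overlaps are allowed.
    """
    slots = []
    cursor = day_start  # earliest moment not yet covered by a busy interval
    for start, end in sorted(busy):
        # Clamp each event to the working window so outside events are ignored.
        start = min(max(start, day_start), day_end)
        end = min(max(end, day_start), day_end)
        if start - cursor >= duration:
            slots.append((cursor, start))
        cursor = max(cursor, end)
    if day_end - cursor >= duration:
        slots.append((cursor, day_end))
    return slots


if __name__ == "__main__":
    def at(hour, minute=0):
        return datetime(2025, 1, 7, hour, minute, tzinfo=timezone.utc)

    meetings = [(at(9, 30), at(10, 30)), (at(10), at(11)), (at(13), at(14))]
    for start, end in free_slots(meetings, at(9), at(17), timedelta(hours=1)):
        print(start.time(), "to", end.time())
```

    Keeping every datetime timezone-aware (UTC here) is what lets the same comparison work across time zones without manual conversion; converting back to each attendee's local zone is left to the caller.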

  • View profile for RAVI JOSHI

    Lead Project Planner | Primavera P6 Scheduler | PMO | Project Controller | Energy, EPC & Nuclear | EVM | CPM | Baseline | Planificateur | 16 Years | Available Europe | Project Scheduling, Planning & Controls | Power BI

    1,493 followers

    An effective Primavera P6 schedule is not just a timeline. It is the foundation of project control. A strong schedule is built with discipline, logic, and a clear workflow. For Primavera P6 Planner, Scheduler, Lead Project Planner, and Project Controls roles, the quality of the schedule directly affects project visibility, decision-making, and delivery reliability.

    A structured Primavera P6 workflow should follow a clear sequence:
    1. Create OBS
    2. Create EPS and link it with OBS
    3. Create calendar
    4. Create new project
    5. Check calendar
    6. Check OBS / responsible manager
    7. Create WBS and breakdown structure
    8. Create activities
    9. Change type of activities if required
    10. Assign relationships
    11. Check relationship type: FS, SS, FF, SF
    12. Check lag / lead and make changes if required
    13. Schedule to check the changes
    14. Save the schedule log
    15. Make changes to data date
    16. Decide whether the project needs resources or cost
    17. Create resources
    18. Ensure resource ID has unique value
    19. Set resource type: Labor / Non-Labor / Material
    20. Set default units/time as per calendar
    21. Adjust max units/time to avoid overallocation
    22. Set price / unit
    23. Assign resources to activities one by one to get activity cost
    24. Create baseline to preserve timeline and cost
    25. If no resources or cost are needed, create baseline directly
    26. Assign the baseline as Project Baseline
    27. Update status of activities

    Why does this workflow matter?
    A disciplined Primavera P6 workflow improves schedule quality, protects logic integrity, supports baseline control, and gives the project team a reliable platform for tracking progress, resources, cost, and delays. In EPC projects, this is essential for maintaining control over time and delivering with confidence.

    Key best practices:
    • Build the schedule from a proper OBS, EPS, and WBS structure
    • Use correct relationships and lag/lead logic before moving to resource loading
    • Check the schedule after every major change, not only at the end
    • Keep the baseline clean, approved, and traceable
    • Update actual progress consistently using the correct data date

    Common mistakes planners make in Primavera P6:
    • Creating activities before the WBS structure is properly defined
    • Using incorrect or missing logic relationships
    • Overloading resources without checking max units/time
    • Forgetting to save schedule logs and maintain traceability
    • Updating progress without protecting the baseline integrity

    A good Primavera P6 schedule is not built by chance. It is built through a structured workflow, accurate logic, and disciplined control.

    What is the one Primavera P6 habit that has improved your schedule quality the most?

    #PrimaveraP6 #ProjectPlanning #ProjectControls #LeadProjectPlanner #Scheduler #ProjectPlanner #EPC #WBS #OBS #BaselineManagement #CriticalPath #ResourcePlanning #DelayAnalysis #PMO
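
    To make the relationship types (FS, SS, FF, SF) and lag from the workflow above concrete, here is a textbook precedence-diagramming forward pass. It is a simplified sketch, not how Primavera P6 itself schedules: it ignores calendars, constraints, and the backward pass, works in whole-day offsets from day 0, and the `forward_pass` function and activity names are hypothetical.

```python
from graphlib import TopologicalSorter


def forward_pass(durations, relationships):
    """Compute early dates for each activity.

    durations: {activity: duration_in_days}
    relationships: [(predecessor, successor, type, lag_days)], type in FS/SS/FF/SF
    Returns {activity: (early_start, early_finish)}.
    """
    preds = {activity: [] for activity in durations}
    for pred, succ, rel_type, lag in relationships:
        preds[succ].append((pred, rel_type, lag))

    # Visit activities so that every predecessor is dated before its successors.
    graph = {act: [p for p, _, _ in preds[act]] for act in durations}
    dates = {}
    for act in TopologicalSorter(graph).static_order():
        dur = durations[act]
        early_start = 0
        for pred, rel_type, lag in preds[act]:
            pred_start, pred_finish = dates[pred]
            if rel_type == "FS":    # successor starts after predecessor finishes
                bound = pred_finish + lag
            elif rel_type == "SS":  # successor starts after predecessor starts
                bound = pred_start + lag
            elif rel_type == "FF":  # successor finishes after predecessor finishes
                bound = pred_finish + lag - dur
            elif rel_type == "SF":  # successor finishes after predecessor starts
                bound = pred_start + lag - dur
            else:
                raise ValueError(f"unknown relationship type: {rel_type}")
            early_start = max(early_start, bound)
        dates[act] = (early_start, early_start + dur)
    return dates
```

    For example, with durations `{"A": 5, "B": 3, "C": 4, "D": 2}` and links A→B (FS, lag 0), A→C (SS, lag 2), B→D (FS, lag 1), C→D (FF, lag 0), activity D is driven by the FS link from B rather than the FF link from C. Changing a relationship type or lag and rerunning the pass is the small-scale equivalent of "schedule to check the changes".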

  • View profile for Kristian Johannesen

    Databricks Champion | Consulting Manager & Senior Architect @twoday Data & AI

    3,210 followers

    Table-based triggers in Databricks are now GA! 👀 Stop triggering based on the time when what you really care about is the data!

    If you’ve been using Databricks Workflows for a while, chances are that most of your jobs still run because the clock says so ⏰ Cron schedules are useful for a lot of use cases, but until recently they were almost the only good solution we had for proper scheduling. Runs would be scheduled hourly, nightly, or weekly. But that also meant that your pipeline would run whether new data arrived or not 👎

    Sure, you could use file-arrival triggers. But for Delta table updates, a lot of small files can arrive, and we should only run when the full set of files in a transaction is committed. You could do some workarounds to make this work, but ultimately they were all sub-optimal 👎

    Table-based triggers let you start a job when one or more Delta tables are updated 🔄️
    - Not via polling
    - Not on a fixed schedule
    ... But exactly when the table changes have been applied: new rows, updates, merges, new versions 👍

    This shifts orchestration from time-driven to data-driven:
    🚀 Lower latency - no waiting for the next window
    🔗 Better dependencies between jobs and tables
    💰 No wasted runs when nothing changed

    An added benefit is that it allows you to split responsibility for layers or tables across different people or departments. Instead of trying to map out a complete set of workflows, each flow can depend on a set of key tables, allowing smoother, more decentralized scheduling 🙌

    Using the Advanced Settings you can set:
    - Any or All clauses between your selected tables
    - Minimum wait times between triggers
    - Wait times after last change

    Below I have added an example. My favorite way of setting up triggers for a source system that is updated daily, inside a Data Platform used for both BI reporting and system updates:
    ✅ Create a Scheduled Trigger on the job that is used to import data to the platform
    ✅ Create a Table Trigger for each of the downstream jobs - triggering each job based on the specific data they need.

    A few limiting factors to note:
    ⛔ A trigger can only depend on a maximum of 10 different tables.
    ⛔ Using views does not help. It will count each of the underlying tables in the view.
    ⛔ Non Unity Catalog tables are not supported - e.g. Federated Queries.
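
    The three advanced settings and the 10-table limit mentioned above interact in a way that is easier to see in a small model. The class below is not the Databricks API and does not reflect how Databricks evaluates triggers internally; `TableTrigger`, its field names, and the plain-number timestamps are hypothetical, and the logic is only one plausible reading of "Any or All", "minimum wait between triggers", and "wait after last change".

```python
from dataclasses import dataclass

MAX_TABLES = 10  # limit quoted in the post


@dataclass
class TableTrigger:
    tables: list                           # table names the job depends on
    condition: str = "ANY"                 # "ANY" or "ALL" tables must have new commits
    min_seconds_between_triggers: int = 0
    wait_seconds_after_last_change: int = 0

    def __post_init__(self):
        if len(self.tables) > MAX_TABLES:
            raise ValueError(f"a trigger can depend on at most {MAX_TABLES} tables")
        if self.condition not in ("ANY", "ALL"):
            raise ValueError("condition must be 'ANY' or 'ALL'")

    def should_fire(self, last_commit, last_fired, now):
        """last_commit: {table: commit time}; last_fired: previous run time or None."""
        since = last_fired if last_fired is not None else float("-inf")
        changed = [t for t in self.tables if last_commit.get(t, float("-inf")) > since]
        if self.condition == "ALL":
            ready = len(changed) == len(self.tables)
        else:
            ready = bool(changed)
        if not ready:
            return False
        if now - since < self.min_seconds_between_triggers:
            return False  # too soon after the previous run
        newest = max(last_commit[t] for t in changed)
        # Let the tables settle before starting the job.
        return now - newest >= self.wait_seconds_after_last_change
```

    In this reading, an "ALL" trigger over two tables stays quiet until both have committed since the last run and the newest commit is older than the settle time, while an "ANY" trigger fires on the first change but no more often than the minimum interval allows.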

  • View profile for Praveen Singampalli

    Helping Students & Professionals Get Jobs | Built 300k+ DevOps Family Across Socials | AWS Community Builder | Ex-Verizon | Ex-Infosys | 8x SSB Conference Out

    140,942 followers

    𝐊𝐮𝐛𝐞𝐫𝐧𝐞𝐭𝐞𝐬 𝐂𝐨𝐬𝐭 𝐎𝐩𝐭𝐢𝐦𝐢𝐳𝐚𝐭𝐢𝐨𝐧 𝐓𝐞𝐜𝐡𝐧𝐢𝐪𝐮𝐞𝐬

    Kubernetes, a powerful container orchestration platform, can significantly reduce costs when used effectively. Here are some key strategies for optimizing your Kubernetes environment:

    1. Rightsizing and Resource Allocation
       • Pod Limits and Requests: Set precise resource limits and requests for each pod to prevent over-allocation and under-utilization.
       • Node Sizing: Choose the appropriate node size based on your workload requirements to avoid paying for excess resources.
       • Horizontal Autoscaling: Automatically scale the number of pod replicas up or down based on demand to ensure optimal resource utilization.
       • Vertical Autoscaling: Adjust the CPU and memory allocated to pods to match their workload requirements.

    2. Cost Monitoring and Analysis
       • Utilize Cloud Provider Tools: Leverage cloud-specific tools (e.g., AWS Cost Explorer, GCP Cost Management) to track spending and identify cost-saving opportunities.
       • Third-Party Tools: Consider using tools like Kubecost or Prometheus for detailed cost analysis and visualization.
       • Regular Reviews: Regularly review your cost data to identify trends and areas for optimization.

    3. Spot Instances and Preemptible VMs
       • Leverage Spot Instances: Use spot instances or preemptible VMs for non-critical workloads to significantly reduce costs.
       • Implement Fault Tolerance: Ensure your applications can handle interruptions caused by spot instance terminations.

    4. Image Optimization
       • Minimize Image Size: Remove unnecessary files and layers from your container images to reduce download and storage costs.
       • Use Multi-Stage Builds: Create optimized images by building in multiple stages and copying only necessary artifacts.

    5. Network Optimization
       • Network Policies: Use network policies to restrict traffic between pods and reduce unnecessary network traffic.
       • Load Balancing: Implement efficient load balancing strategies to distribute traffic evenly across pods.

    6. Storage Optimization
       • Persistent Volume Claims (PVCs): Use PVCs to manage persistent storage efficiently and avoid over-provisioning.
       • Storage Classes: Create storage classes to define different storage types and their associated costs.
       • Storage Provisioners: Choose appropriate storage provisioners based on your workload requirements and cost considerations.

    7. Cluster Sharing
       • Consolidate Clusters: If possible, consolidate multiple clusters into a single, shared cluster to reduce overhead costs.
       • Namespace Isolation: Use namespaces to logically isolate different workloads within a shared cluster.

    8. Consider Managed Kubernetes Services
       • Evaluate Managed Offerings: Explore managed Kubernetes services (e.g., EKS, GKE, AKS) that often provide cost-effective solutions and managed infrastructure.

    Check here for more Kubernetes projects - https://lnkd.in/g5jCpiQg

    Share this post with your DevOps friends :)
