
Closed
Posted
Paid on delivery
Project Description: We are looking for an experienced PyTorch optimization expert to accelerate the DOVE (Video Super-Resolution) model. The goal is to achieve an end-to-end inference speedup of 1.5x to 1.8x using strictly training-free methods.

Acceptable Optimization Techniques (Training-Free only): You are free to explore and combine the following training-free approaches:
- Token-level routing, Token Merging (ToMe), or Token Pruning.
- Post-Training Quantization (PTQ).
- Attention simplification (e.g., efficient attention mechanisms).
- Coordination/Synergy optimization between VAE and DiT.

Core Requirements:
- Target Model: DOVE (Repository: [login to view URL])
- Acceleration Target: 1.5x - 1.8x end-to-end inference speedup.
- Hardware Baseline: The speedup must be achieved and evaluated on a high-end NVIDIA GPU with 40GB+ VRAM (specifically targeting NVIDIA L40S, H100, or equivalent).
- Testing Condition: The inference speed MUST be measured under the "no tiling" setting.
- Quality Metrics Constraints: Image Quality Assessment (IQA) metrics and Temporal metrics must NOT decrease compared to the original DOVE baseline. PSNR and SSIM are allowed to drop, but the degradation must strictly not exceed 8%.
- Tech Stack: Native PyTorch.

Current Progress: We already have a preliminary working version implemented with Token Merging. This can be provided to the hired freelancer as a baseline/reference.

Budget & Timeline:
- Budget: €700 (Fixed Price upon successful delivery and testing).
- Timeline: 1 Month (4 Weeks).

To Apply, Please Provide: Briefly describe your proposed training-free pipeline (e.g., which combination of pruning/quantization/attention simplification you plan to use) and confirm your availability for the 1-month timeline.
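The acceptance criteria in the description can be read as a simple pass/fail check. The sketch below is only an illustration: the function name `check_delivery` and the metric keys are hypothetical, the 8% limit is assumed to mean a relative drop from the baseline value, and every metric other than PSNR/SSIM is assumed to be higher-is-better (a lower-is-better score would need its sign flipped first).

```python
def check_delivery(baseline, optimized, speedup,
                   min_speedup=1.5, max_rel_drop=0.08):
    """Return the names of the criteria that fail (empty list = pass).

    baseline / optimized: dicts mapping metric name -> score.
    Assumption: "psnr" and "ssim" may drop by at most max_rel_drop
    (relative to baseline); all other metrics must not decrease.
    """
    failures = []

    # End-to-end speedup must reach the lower bound of the target range.
    if speedup < min_speedup:
        failures.append("speedup")

    for name, base_value in baseline.items():
        new_value = optimized[name]
        if name in ("psnr", "ssim"):
            # Relative degradation, e.g. 30.0 -> 28.5 dB is a 5% drop.
            drop = (base_value - new_value) / base_value
            if drop > max_rel_drop:
                failures.append(name)
        elif new_value < base_value:
            # IQA and temporal metrics are not allowed to get worse.
            failures.append(name)

    return failures
```

With hypothetical scores, a run at 1.6x whose PSNR falls from 30.0 to 28.5 passes, while the same run with PSNR at 27.0 (a 10% drop) is rejected.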
Project ID: 40423900
80 proposals
Remote project
Active 3 days ago
80 freelancers are bidding on average €525 EUR for this job

Hello, I understand you want to speed up the DOVE video super-resolution model on NVIDIA GPUs using training-free methods, while keeping IQA/temporal metrics within 8% degradation. My approach is to build a compact, multi-stage optimization plan that focuses on token-level efficiency, PTQ, and attention simplifications, with careful validation against the no-tiling setup on 40GB+ GPUs. I will start from your Token Merging baseline, then combine token pruning, ToMe-like routing, and lightweight attention variants, followed by post-training quantization and cross-module coordination tweaks between VAE and DiT. The goal is to reach 1.5x-1.8x end-to-end speedup with minimal quality loss, verified by PSNR/SSIM/temporal metrics and strict no-tiling measurements.

I propose a concise, repeatable pipeline:
- Evaluate cache-friendly operator fusion and memory layout optimizations to reduce latency.
- Apply targeted token-level reductions and dynamic routing to minimize active tokens without hurting accuracy.
- Implement PTQ with per-layer calibration for minimal accuracy drift.
- Integrate efficient attention variants and VAE/DiT coordination tuning, ensuring end-to-end speedups.
- Rigorously test on the specified GPU, compare to baseline, and document results.

What is the exact baseline DOVE commit you want me to use, and do you have any preferred PTQ toolchain constraints (e.g., specific PyTorch version or CUDA toolkit)?

Best regards, Muhammad Awais
€750 EUR in 16 days
7.2
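For readers unfamiliar with the Token Merging baseline this bid builds on, the following is a minimal pure-Python sketch of the general idea behind ToMe-style bipartite matching. It is not the client's implementation and not DOVE code: the function names are hypothetical, tokens are plain lists of floats, and a real version would operate on batched PyTorch tensors inside the DiT blocks.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def merge_tokens(tokens, r):
    """Merge the r most similar (source, destination) token pairs.

    Tokens at even positions are sources, tokens at odd positions are
    destinations. Returns len(tokens) - r tokens: the unmerged sources
    followed by the destinations (original order is not preserved).
    """
    src, dst = tokens[0::2], tokens[1::2]
    if not dst:
        return [list(t) for t in tokens]
    r = max(0, min(r, len(src)))

    # For every source token, find its most similar destination token.
    edges = []
    for i, s in enumerate(src):
        sims = [cosine(s, d) for d in dst]
        j = max(range(len(dst)), key=sims.__getitem__)
        edges.append((sims[j], i, j))

    # Keep only the r strongest edges; those sources get merged away.
    edges.sort(reverse=True)
    merged = {i: j for _, i, j in edges[:r]}

    # Average each destination with the sources merged into it.
    sums = [list(d) for d in dst]
    counts = [1] * len(dst)
    for i, j in merged.items():
        sums[j] = [a + b for a, b in zip(sums[j], src[i])]
        counts[j] += 1
    merged_dst = [[x / c for x in s] for s, c in zip(sums, counts)]

    kept_src = [list(s) for i, s in enumerate(src) if i not in merged]
    return kept_src + merged_dst
```

The parameter `r` plays the role of the merge ratio: each call shortens the sequence by `r` tokens, which is where the attention and MLP savings would come from.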

Hello, I trust you're doing well. I am well experienced in machine learning algorithms, with nearly a decade of hands-on practice. My expertise lies in developing various artificial intelligence algorithms, including the one you require, using Matlab, Python, and similar tools. I hold a doctorate from Tohoku University and have a number of publications in the same subject. My portfolio, which showcases my past work, is available for your review. Your project piqued my interest, and I would be delighted to be part of it. Let's connect to discuss in detail. Warm regards. please check my portfolio link: https://www.freelancer.com/u/sajjadtaghvaeifr
€500 EUR in 7 days
7.2

As an experienced Machine Learning expert, I believe I am the perfect fit for your project, "Speedup VSR Model with PyTorch Optimization." Having worked on numerous optimization projects with PyTorch and proficient in training-free methods, I can bring the right set of skills to accelerate the DOVE model while adhering to strict quality metrics constraints. On your project's core requirements, I have extensively deployed various approaches like Token-level routing, Token Merging (ToMe), Token Pruning, Post-Training Quantization (PTQ) which synergistically optimize Native PyTorch models' performance. Additionally, my solid comprehension of efficient attention mechanisms promises a viable solution for attention simplification. To ensure a harmonized VAE and DiT relationship, a critical requirement in your project, I have previously worked on coordination optimization tasks. Budget and Timeline-wise, we guarantee timely delivery without compromising quality. With my team at Live Experts, we've consistently satisfied clients like you by transforming their imaginations into tangible solutions. Your satisfaction is paramount to us. Allow me and my team to materialize your vision.
€750 EUR in 7 days
6.9

I understand the importance of achieving an impressive speedup in your DOVE model, while maintaining top-notch image quality. Building on the preliminary work you've done with Token Merging, I propose an innovative training-free pipeline that leverages Token Pruning and Post-Training Quantization to push the boundaries of PyTorch optimization. I'm Doan, a seasoned Full-Stack Developer, experienced in utilizing these techniques to minimize computational overhead without compromising output quality. Combining this with my adeptness at attention simplification, we can significantly enhance inference efficiency. My qualifications extend beyond Token-level routing and quantization. With expertise in various deep learning frameworks including PyTorch, I have successfully delivered projects with similar accelerated demands like sentiment classification and object detection within the specified time frame. Let's collaborate to revolutionize your DOVE repository with cutting-edge advancements while adhering to your budgetary constraints. My full attention will be dedicated towards ensuring that PSNR and SSIM degradation remains well within allowable bounds and IQA metrics are not compromised. Together, we can redefine the horizon of video super-resolution. So let's code videos at extreme speeds!
€500 EUR in 7 days
5.5

Hi, thanks for outlining the DOVE optimization work so clearly. You need a training‑free speedup on a heavy VSR model without hurting IQA or temporal stability, and that’s doable with the right mix of pruning and PTQ. I’ve worked on similar PyTorch acceleration tasks for diffusion and VAE‑based systems.

I’d focus on concrete changes:
• Merge and prune tokens adaptively per‑frame
• Apply PTQ only on attention and MLP blocks with safe ranges
• Replace attention with an efficient kernel
• Profile VAE-DiT sync points to cut overhead

I can start right away and fit the one‑month window. Which parts of DOVE’s inference graph currently show the highest kernel‑level latency in your profiler?

Regards, Slavko
€250 EUR in 5 days
4.2
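To make the PTQ item in this bid concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is an assumption-laden illustration, not the bidder's method: the helper names are hypothetical, the weight tensor is a flat non-empty list of floats, and a real pipeline would calibrate scales per layer on actual activations and use the framework's quantized kernels.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization to the int8 range [-127, 127].

    Returns (quantized integers, scale). Assumes a non-empty list.
    """
    max_abs = max(abs(w) for w in weights)
    # The scale maps the largest magnitude onto 127; guard all-zero input.
    scale = max_abs / 127 if max_abs else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate float weights."""
    return [v * scale for v in q]
```

The round-trip error per weight is bounded by half the scale, so tensors with a few large outliers lose the most precision. That is one reason a "safe ranges" policy would restrict quantization to selected blocks.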

⭐⭐⭐⭐⭐ Optimize DOVE Model for Enhanced Video Super-Resolution Performance ❇️ Hi My Friend, I hope you're doing well. I've reviewed your project needs and see that you're looking for a PyTorch optimization expert for the DOVE model. You don’t need to look any further; Zohaib is here to assist you! My team has successfully completed over 50 projects in model optimization. I will utilize training-free techniques to achieve the desired speedup while maintaining quality metrics. ➡️ Why Me? I can efficiently optimize your DOVE model to achieve a 1.5x to 1.8x inference speedup. With 5 years of experience in PyTorch and model optimization, I specialize in techniques like token merging and post-training quantization. My strong grip on GPU performance tuning ensures effective results. ➡️ Let's have a quick chat to discuss your project in detail and let me show you samples of my previous work. I'm excited to collaborate and achieve your goals! ➡️ Skills & Experience: ✅ PyTorch Optimization ✅ Model Acceleration ✅ Token Merging ✅ Post-Training Quantization ✅ Attention Mechanisms ✅ GPU Performance Tuning ✅ Video Super-Resolution ✅ Deep Learning Techniques ✅ Quality Metrics Evaluation ✅ Code Optimization ✅ Data Processing ✅ Performance Analysis Looking forward to your response! Best Regards, Zohaib
€350 EUR in 2 days
5.0

Hi, I have reviewed the details of your project. I can work on optimizing your existing Token Merging baseline and improve it further using a combination of token pruning, efficient attention mechanisms, and selective post-training quantization while keeping the pipeline fully training-free. I will focus on reducing compute overhead between the VAE and DiT components and removing redundant attention and token operations to improve throughput on L40S or H100-level hardware. I will carefully benchmark under no-tiling settings and ensure quality constraints are respected, especially keeping IQA and temporal consistency stable within your limits. I am available for the full 1-month timeline and can start immediately. Can we schedule a quick meeting to discuss the project in detail? It will help me understand your setup better and give you a clear execution plan and milestones. I will also share my portfolio during the chat.
€500 EUR in 7 days
4.0

Hi, I am a seasoned Applied ML Engineer (6+ years of experience), and I can help accelerate DOVE with strictly training-free optimization, targeting the requested 1.5×–1.8× end-to-end speedup while keeping quality degradation within the defined limits.
>> My approach will be benchmark-first: I will first reproduce the original DOVE inference under the required no-tiling setting and record per-video latency, GPU memory, preprocessing time, model time, save time, and quality metrics.
>> I will then apply safe PyTorch-native optimizations: float16/bfloat16 execution depending on GPU, torch.inference_mode(), optimized CUDA memory handling, SDPA/efficient attention paths, and VAE slicing/tiling controls for development.
>> For model-level acceleration, I will evaluate Token Merging / Token Pruning conservatively: starting with low merge ratios, applying them only to middle/later DiT blocks, and avoiding early/final layers to preserve spatial detail and temporal consistency.
>> I will also test post-training quantization on transformer linear layers first, not the VAE, since VAE degradation can visibly hurt reconstruction quality. INT8/FP8 will be benchmarked depending on hardware support.
>> Deliverables will include optimized PyTorch code, reproducible commands, per-run logs, timing CSV/JSON, an ablation table, and a before/after quality comparison using PSNR, SSIM, and LPIPS/DISTS/CLIPIQA.
>> I will not blindly promise 1.8×; I will implement measurable training-free variants and select the fastest configuration that passes the quality constraints.
€500 EUR in 7 days
4.2
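The benchmark-first step in this bid could look roughly like the timing harness below. It is a sketch under stated assumptions rather than the bidder's tooling: `benchmark` and `speedup` are hypothetical helper names, and the `sync` hook is a placeholder for a device barrier. On a CUDA GPU one would typically pass `torch.cuda.synchronize`, because kernels launch asynchronously and wall-clock timing without a barrier undercounts.

```python
import statistics
import time


def benchmark(fn, warmup=3, runs=10, sync=None, clock=time.perf_counter):
    """Return the median wall-clock seconds of fn() over `runs` timed calls.

    sync: optional zero-argument callable that blocks until the device
    has finished all queued work (assumption: supplied by the caller).
    clock: injectable timer, mainly so the harness itself can be tested.
    """
    # Warm-up calls absorb one-off costs such as allocation and autotuning.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(runs):
        if sync:
            sync()  # make sure earlier work is not billed to this run
        start = clock()
        fn()
        if sync:
            sync()  # wait for this run's work to actually finish
        samples.append(clock() - start)
    return statistics.median(samples)


def speedup(baseline_seconds, optimized_seconds):
    """End-to-end speedup factor; 1.5 means the optimized run is 1.5x faster."""
    return baseline_seconds / optimized_seconds
```

Using the median rather than the mean keeps a single slow outlier run from distorting the reported speedup.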

Hello, I will accelerate the DOVE model to achieve an end to end inference speedup of 1.5x to 1.8x using strictly training free methods. I would be happy to share my portfolio of similar projects via chat. Let's connect and get this moving forward. Best regards, Fahad.
€300 EUR in 15 days
3.9

Hello! I am a US-based senior software engineer with extensive experience in AI and deep learning technologies. I’ve carefully read your project description regarding the VSR model optimization with PyTorch, and I’m excited about the opportunity to help accelerate its performance. With over 15 years in the field, I've developed a strong background in machine learning and AI model development. I understand the intricacies involved in optimizing models for better speed and efficiency. My approach involves not just technical execution but also ensuring the solution aligns with your project goals. To clarify a few important points that will guide my optimization strategy: Could you please clarify the following questions to help me better understand the project? 1. What specific performance metrics are you aiming to improve with the VSR model? 2. Are there any existing benchmarks or constraints I should be aware of during the optimization process? I believe in clear communication and structured milestones to ensure we stay aligned throughout the project. If you're looking for someone who pays close attention to detail and can deliver optimal results, let’s connect! Looking forward to the possibility of collaborating on this exciting project. Best, James Zappi
€600 EUR in 3 days
2.0

Hi, I see you want to accelerate the DOVE video super-resolution model by 1.5x–1.8x using training-free methods, without degrading image quality beyond allowed limits. I’ve optimized PyTorch models with token pruning, attention simplification, and post-training quantization before, achieving substantial speedups while keeping IQA metrics intact. For DOVE, I’d combine Token Merging with lightweight attention simplification and selective PTQ, carefully coordinating the VAE and DiT modules to minimize any quality loss. Your baseline implementation gives a perfect starting point for iterative speed testing on L40S/H100-class GPUs. I can commit to the 1-month timeline and provide end-to-end speedup tests, PSNR/SSIM/IQA metrics verification, and a fully documented pipeline for reproducibility. Best regards, Joel M.
€700 EUR in 7 days
2.2

Hello There, As per my understanding you want to optimize the DOVE video super resolution model to achieve a 1.5x to 1.8x inference speedup using methods that do not require retraining on premium NVIDIA hardware.
1) Does your existing Token Merging baseline already achieve a specific speedup percentage that we need to exceed?
2) Are you open to using Flash Attention 2 or should the solution remain strictly within base PyTorch libraries for maximum portability?
3) Do you have a specific test dataset or video length benchmark for the no tiling measurement?
I will take your video processing model and make it run significantly faster without compromising the visual quality your users expect. You will get a high performance pipeline that handles high resolution video processing on your GPU hardware much more efficiently, reducing your operational costs and wait times. My goal is to give you a production ready solution that maintains smooth temporal consistency while hitting your exact acceleration targets. I will implement a multi layered optimization strategy starting with a more aggressive Token Pruning and Merging schedule specifically tuned for the DOVE architecture. I will explore Post Training Quantization for the DiT layers to reduce memory bandwidth bottlenecks and implement efficient attention mechanisms to handle the no tiling requirement.
Best regards, Bharat Joshi
€700 EUR in 25 days
2.1

Leveraging on my proven expertise as a Full-Stack Developer, and an AI/ML specialist, I can guarantee to deliver top-level optimization for your DOVE model. In order to achieve your target acceleration of 1.5x - 1.8x by exploring only training-free methods, I propose a dynamic approach. My plan involves implementing a combination of Token Pruning for token-level routing, Post-Training Quantization, and coordination optimization between VAE and DiT. My proficiency in TensorFlow and especially PyTorch -- as stated in my tech stack -- puts me in an excellent position to understand the existing DOVE model codebase easily. I'm also very comfortable using high-end NVIDIA GPUs such as those targeted (L40S, H100 or equivalent) for testing the acceleration speedup conditions. Having worked on projects involving AI chatbots, NLP, predictive analytics, computer vision just to mention a few, I’ve found ways to retain and often improve quality measures like Image Quality Assessment metrics and Temporal metrics. I assure you that with my knowledge and dedication & the preliminary work completed on Token Merging as our baseline, we'll attain your target speed-up goal within the one-month timeline while observing the strict quality constraints. Choose me for reliable, efficient counsel and execution of this project! With my broad skills base from UI/UX design to SEO & Digital Growth, I will approach this task with meticulousness to ensure every requirement is met perfectly at all levels
€445 EUR in 7 days
2.2

As a seasoned software engineer and machine learning specialist, I believe my extensive experience in the field can be effectively leveraged to accelerate the DOVE model without compromising on image quality. My approach involves integrating Post-Training Quantization (PTQ) with Attention simplification techniques, ensuring that essential quality metrics such as Image Quality Assessment (IQA) are not affected. By minimizing dependencies on the training process, I'm confident of delivering an end-to-end inference speedup within your desired range of 1.5x - 1.8x. The flexibility in approaches is another strength I bring to the table as I'm comfortable exploring various token-level routing, token merging (ToMe) or pruning strategies combined with training-free methods to achieve optimal outcomes. I've worked extensively with PyTorch in modeling and optimizing complex deep learning architectures and will prioritize delivering high performance without exceeding PSNR or SSIM degradation above 8%. Additionally, my familiarity and proficiency with Native PyTorch combined with my expertise in handling high-end NVIDIA GPUs with 40GB+ VRAM makes me an ideal fit for this project. Given the preliminary working version you already have implemented, I am confident that we can achieve amazing results within a month's time frame. Interested in a desirable outcome? Partner with me!
€563.33 EUR in 2 days
1.8

As a skilled and passionate AI professional, I am confident in my ability to optimize your DOVE model while strictly adhering to your training-free approach. I understand the importance of accelerating the model without compromising its image quality assessment (IQA) metrics and temporal metrics, which is precisely why I've already begun brainstorming which combination of token-level routing, post-training quantization, attention simplification, and coordination/synergy optimization would best suit your specific needs. Rest assured that I'll leave no stone unturned in achieving your target of a 1.5x - 1.8x end-to-end inference speedup on the targeted high-end Nvidia GPU with no degradation exceeding 8%. With regards to my experience, I have an in-depth understanding and demonstrated proficiency in PyTorch that will be indispensable for this project. Furthermore, I have hands-on experience implementing various AI models and techniques in my previous projects, including token merging similar to what your team has already started with. While I understand that this is an accelerated one-month timeline, please know that I'm committed to delivering optimal results within the stipulated timeframe. My dedication to client satisfaction coupled with my reliable service should give you confidence in my capacity to meet the project's requirements proficiently. Let's work together towards improving your DOVE model for enhanced VSR efficiency!
€500 EUR in 1 day
0.0

Hello, Leveraging over 9 years of experience as a Senior Full Stack & DevOps Engineer, my expertise spans a vast range of significant areas relevant to your project. My deep understanding of the PyTorch optimization framework combined with solid experience in implementing training-free techniques such as Token-level routing, Token Merging (ToMe), and Token Pruning make me the ideal candidate to undertake the task. I am also well-versed in Post-Training Quantization and efficient attention mechanisms, adding to my suitability for this role. Considering your preferred budget and timeline, my proposed training-free pipeline would involve a holistic approach that includes a combination of pruning, quantization, and attention simplification techniques. Starting with an in-depth analysis of the DOVE model repository you provided, I aim to identify potential areas for improvement and execute the required adjustments accordingly. Complementing my technical skills is an unyielding commitment to quality. I understand that your project metric requirements are rather stringent. However, I am confident in my ability to meet them while ensuring that Image Quality Assessment (IQA) metrics and Temporal metrics are preserved or only marginally impacted within the prescribed limits. As an added guarantee, I’m available for any necessary revisions and testings within the stipulated month-long time frame. Let us build a highly optimized DOVE model together! Thanks! Chibike
€555 EUR in 2 days
0.0

Hey, I just finished reading the job description and I see you are looking for someone experienced in AI Rendering, AI Content Creation, AI Research, AI Development, Deep Learning, AI Image Editing, Machine Learning (ML) and AI Model Development. This is something I can do. Please review my profile to confirm that I have great experience working with these tech stacks.
I have a few questions:
1. Are these all the requirements? If not, please share more detailed requirements.
2. Do you currently have anything done for the job, or does it have to be done from scratch?
3. What is the timeline to get this done?
Why Choose Me?
1. I have done more than 250 major projects.
2. I have not received a single piece of bad feedback in the last 5-6 years.
3. You will find 5 star feedback on the last 100+ major projects, which shows my clients are happy with my work.
Timings: 9am - 9pm Eastern Time (I work as a full time freelancer)
I will share my recent work with you in the private chat due to privacy concerns! Please start the chat to discuss it further. Regards, Adil.
€250 EUR in 6 days
0.0

Hello; We are interested in your Speedup VSR Model with PyTorch Optimization project. We are a professional team of expert architects from all over the world. Our team offers the highest quality and most effective projects with over 14 years of experience and works with a focus on 100% customer satisfaction. When you choose us, you will have a final delivery that exceeds your expectations. We look forward to working with you! Take a look at our past work on our portfolio: https://www.freelancer.com/u/worldarcpart Kind Regards WORLD ARCHITECTURE PARTNERS
€250 EUR in 2 days
0.0

As a seasoned full stack developer with over 12 years of experience, I am confident that I possess the skills necessary to optimize your DOVE model and successfully achieve the targeted 1.5x - 1.8x inference speedup. With regard to your preference for training-free techniques, my expertise in machine learning and deep learning, combined with my firm grasp of PyTorch optimizations, will enable me to intelligently apply token-level routing, post-training quantization, efficient attention mechanisms, and other appropriate approaches. Throughout my career, I have consistently satisfied clients by providing high-quality solutions within budget and on time. I have a strong understanding of agile methodologies and problem-solving techniques that will assist me in meeting your specific needs. I'm also well-versed in using Hugging Face, TensorFlow, scikit-learn, and Keras, and have solid experience with NVIDIA GPUs with extensive VRAM like L40S and H100. Massive thanks for considering my profile for this critical project. I'm prepared to give it my all over the next month to deliver results that not only meet but exceed your expectations. Let's optimize your DOVE model together and enhance its efficiency while ensuring no compromises on image quality assessments or temporal metrics. I pledge my dedication, skills, and knowledge toward achieving an impressive speedup for your VSR model.
€500 EUR in 7 days
0.0

Hi, this is Kris from McKinney, Texas. I've reviewed your project requirements and understand that you are seeking an experienced PyTorch optimization expert to accelerate the DOVE (Video Super-Resolution) model by achieving an end-to-end inference speedup of 1.5x to 1.8x using training-free methods. One of the key challenges will be ensuring that the acceleration target is met without compromising the Image Quality Assessment (IQA) metrics and Temporal metrics. My approach to completing this project would involve leveraging a combination of token-level routing, post-training quantization, and attention simplification techniques to optimize the DOVE model.
A few additional questions:
Q1: Are there any specific IQA metrics or quality constraints that are of utmost importance for this project?
Q2: Could you provide more details on the current hardware setup and testing conditions?
Q3: Is there a preferred method of communication for updates and feedback during the project duration?
Best regards, Kris Kramer
€250 EUR in 3 days
0.0

Würzburg, Germany
Payment method verified
Member since May 26, 2024
€250-750 EUR