Torrent details for "Reinforcement Learning Specialization"
Checked by:
Category:
Language: English
Total Size: 4.6 GB
Info Hash: E00A4FC3F94EF3FF923884F09A47FFF540D7EE60
Added By:
Added: July 18, 2023, 7:02 p.m.
Stats: (Last updated: May 22, 2025, 4:57 p.m.)
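For convenience, the info hash listed above can be turned into a magnet URI in the standard BitTorrent form (`magnet:?xt=urn:btih:<hash>`, per BEP 9); a minimal Python sketch using only the values shown on this page:

```python
# Build a magnet URI from the info hash and display name listed on this page.
# "dn" is only a display-name hint for the client; the hash identifies the torrent.
from urllib.parse import quote

info_hash = "E00A4FC3F94EF3FF923884F09A47FFF540D7EE60"
display_name = "Reinforcement Learning Specialization"

magnet = f"magnet:?xt=urn:btih:{info_hash}&dn={quote(display_name)}"
print(magnet)
```

Any tracker URLs would be appended as additional `&tr=` parameters; none are listed on this page, so the sketch omits them.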
| File | Size |
|---|---|
| 04_warren-powell-approximate-dynamic-programming-for-fleet-management-long.mp4 | 145.3 MB |
| TutsNode.net.txt | 63 bytes |
| 01_sequential-decision-making_quiz.html | 210.3 KB |
| 01_course-4-introduction.en.txt | 2.3 KB |
| 01_dynamic-programming_quiz.html | 157.5 KB |
| 04_read-me-pre-requisites-and-learning-objectives_Course_2__Sample_Based_Learning_Methods_Learning_Objectives.pdf | 83.1 KB |
| 06_read-me-pre-requisites-and-learning-objectives_Fundamentals_of_Reinforcement_Learning__Learning_Objectives.pdf | 64.7 KB |
| 03_read-me-pre-requisites-and-learning-objectives_Prediction_and_Control_with_Function_Approximation_Learning_Objectives.pdf | 59.9 KB |
| 03_reinforcement-learning-textbook_instructions.html | 2.2 KB |
| 04_pre-requisites-and-learning-objectives_A_Complete_Reinforcement_Learning_System_Capstone__Learning_Objectives.pdf | 56.8 KB |
| 04_emma-brunskill-batch-reinforcement-learning.en.srt | 24.9 KB |
| 02_course-introduction.en.txt | 5.6 KB |
| 0 | 14 bytes |
| 05_reinforcement-learning-textbook_RLbook2018.pdf | 85.3 MB |
| 04_warren-powell-approximate-dynamic-programming-for-fleet-management-long.en.srt | 40.7 KB |
| 02_graded-value-functions-and-bellman-equations_exam.html | 31.1 KB |
| 04_warren-powell-approximate-dynamic-programming-for-fleet-management-long.en.txt | 21.3 KB |
| 02_satinder-singh-on-intrinsic-rewards.en.srt | 21.0 KB |
| 02_michael-littman-the-reward-hypothesis.en.srt | 18.5 KB |
| 03_andy-barto-and-rich-sutton-more-on-the-history-of-rl.en.srt | 15.9 KB |
| 03_meet-your-instructors.en.srt | 15.9 KB |
| 03_lets-review-average-reward-a-new-way-of-formulating-control-problems.en.srt | 15.2 KB |
| 02_lets-review-examples-of-episodic-and-continuing-tasks.en.txt | 2.5 KB |
| 01_average-reward-a-new-way-of-formulating-control-problems.en.srt | 15.2 KB |
| 03_david-silver-on-deep-learning-rl-ai.en.srt | 14.7 KB |
| 01_meeting-with-niko-choosing-the-learning-algorithm.en.txt | 2.8 KB |
| 04_jonathan-langford-contextual-bandits-for-real-world-reinforcement-learning.en.srt | 14.0 KB |
| 01_gradient-descent-for-training-neural-networks.en.srt | 14.0 KB |
| 01_lets-review-expected-sarsa.en.txt | 2.8 KB |
| 04_iterative-policy-evaluation.en.srt | 13.7 KB |
| 02_joelle-pineau-about-rl-that-matters.en.srt | 13.7 KB |
| 02_lets-review-what-is-q-learning.en.txt | 2.6 KB |
| 02_meet-your-instructors.en.srt | 13.4 KB |
| 02_meet-your-instructors.en.srt | 13.4 KB |
| 02_meet-your-instructors.en.srt | 13.4 KB |
| 02_policy-iteration.en.srt | 13.3 KB |
| 04_emma-brunskill-batch-reinforcement-learning.en.txt | 13.2 KB |
| 03_gaussian-policies-for-continuous-actions.en.srt | 12.8 KB |
| 02_andy-barto-on-what-are-eligibility-traces-and-why-are-they-so-named.en.srt | 12.5 KB |
| 01_what-is-the-trade-off.en.srt | 12.2 KB |
| 01_optimal-policies.en.srt | 12.2 KB |
| 03_warren-powell-approximate-dynamic-programming-for-fleet-management-short.en.srt | 12.1 KB |
| 01_mdps_quiz.html | 11.8 KB |
| 02_michael-littman-the-reward-hypothesis.en.txt | 11.6 KB |
| 03_doina-precup-building-knowledge-for-ai-agents-with-reinforcement-learning.en.srt | 11.3 KB |
| 04_rich-sutton-the-importance-of-td-learning.en.srt | 11.2 KB |
| 02_satinder-singh-on-intrinsic-rewards.en.txt | 11.0 KB |
| 02_demonstration-with-actor-critic.en.srt | 10.9 KB |
| 03_using-optimal-value-functions-to-get-optimal-policies.en.srt | 10.8 KB |
| 01_agent-architecture-meeting-with-martha-overview-of-design-choices.en.srt | 10.8 KB |
| 04_using-monte-carlo-for-prediction.en.srt | 10.6 KB |
| 03_what-is-monte-carlo.en.srt | 10.5 KB |
| 02_drew-bagnell-on-system-id-optimal-control.en.srt | 10.5 KB |
| 05_rich-sutton-and-andy-barto-a-brief-history-of-rl.en.srt | 10.5 KB |
| 03_moving-to-parameterized-functions.en.srt | 10.4 KB |
| 02_course-introduction.en.srt | 10.4 KB |
| 03_susan-murphy-on-rl-in-mobile-health.en.srt | 10.4 KB |
| 04_value-functions.en.srt | 10.3 KB |
| 03_drew-bagnell-self-driving-robotics-and-model-based-rl.en.srt | 10.3 KB |
| 04_state-aggregation-with-monte-carlo.en.srt | 10.2 KB |
| 03_learning-policies-directly.en.srt | 10.2 KB |
| 02_introducing-gradient-descent.en.srt | 9.9 KB |
| 01_bellman-equation-derivation.en.srt | 9.6 KB |
| 03_markov-decision-processes.en.srt | 9.6 KB |
| 02_lets-review-expected-sarsa-with-function-approximation.en.txt | 2.1 KB |
| 01_lets-review-markov-decision-processes.en.srt | 9.6 KB |
| 05_csaba-szepesvari-on-problem-landscape.en.srt | 9.6 KB |
| 03_david-silver-on-deep-learning-rl-ai.en.txt | 9.5 KB |
| 01_average-reward-a-new-way-of-formulating-control-problems.en.txt | 9.4 KB |
| 03_lets-review-average-reward-a-new-way-of-formulating-control-problems.en.txt | 9.4 KB |
| 04_generalization-properties-of-coarse-coding.en.srt | 9.4 KB |
| 03_gradient-monte-for-policy-evaluation.en.srt | 9.3 KB |
| 02_the-policy-gradient-theorem.en.srt | 9.3 KB |
| 02_in-depth-with-changing-environments.en.srt | 9.2 KB |
| 02_actor-critic-algorithm.en.srt | 9.2 KB |
| 02_weekly-reading_instructions.html | 1.2 KB |
| 1 | 3 bytes |
| 02_weekly-reading_RLbook2018.pdf | 85.3 MB |
| 04_lets-review-actor-critic-algorithm.en.srt | 9.2 KB |
| 01_the-objective-for-learning-policies.en.srt | 8.9 KB |
| 01_course-3-introduction.en.srt | 8.9 KB |
| 02_joelle-pineau-about-rl-that-matters.en.txt | 8.8 KB |
| 03_sequential-decision-making-with-evaluative-feedback.en.srt | 8.7 KB |
| 04_generalization-and-discrimination.en.srt | 8.7 KB |
| 04_episodic-sarsa-in-mountain-car.en.srt | 8.7 KB |
| 01_meeting-with-martha-discussing-your-results.en.txt | 2.4 KB |
| 02_meet-your-instructors.en.txt | 8.6 KB |
| 02_course-wrap-up.en.srt | 3.0 KB |
| 02_course-wrap-up.en.txt | 1.8 KB |
| 02_meet-your-instructors.en.txt | 8.6 KB |
| 02_meet-your-instructors.en.txt | 8.6 KB |
| 02_optimistic-initial-values.en.srt | 8.5 KB |
| 03_meet-your-instructors.en.txt | 8.4 KB |
| 02_optimization-strategies-for-nns.en.srt | 8.4 KB |
| 01_specialization-introduction.en.txt | 2.6 KB |
| 01_lets-review-optimization-strategies-for-nns.en.srt | 8.4 KB |
| 02_optimal-value-functions.en.srt | 8.3 KB |
| 06_using-tile-coding-in-td.en.srt | 8.3 KB |
| 02_the-true-objective-for-td.en.srt | 8.2 KB |
| 05_martin-riedmiller-on-the-collect-and-infer-framework-for-data-efficient-rl.en.srt | 8.2 KB |
| 01_the-advantages-of-temporal-difference-learning.en.srt | 8.2 KB |
| 01_lets-review-comparing-td-and-monte-carlo.en.srt | 8.1 KB |
| 02_comparing-td-and-monte-carlo.en.srt | 8.1 KB |
| 01_meeting-with-adam-parameter-studies-in-rl.en.srt | 8.1 KB |
| 03_andy-barto-and-rich-sutton-more-on-the-history-of-rl.en.txt | 8.1 KB |
| 05_reinforcement-learning-textbook_instructions.html | 2.2 KB |
| 02_estimating-action-values-incrementally.en.srt | 8.1 KB |
| 01_practice-value-functions-and-bellman-equations_quiz.html | 8.0 KB |
| 06_read-me-pre-requisites-and-learning-objectives_instructions.html | 2.6 KB |
| 01_module-1-learning-objectives_instructions.html | 2.8 KB |
| 02_weekly-reading_instructions.html | 1.2 KB |
| 02_the-dyna-algorithm.en.srt | 7.8 KB |
| 03_off-policy-monte-carlo-prediction.en.srt | 7.8 KB |
| 03_what-is-temporal-difference-td-learning.en.srt | 7.8 KB |
| 02_efficiency-of-dynamic-programming.en.srt | 7.7 KB |
| 04_advantages-of-policy-parameterization.en.srt | 7.7 KB |
| 01_continuing-tasks.en.srt | 7.6 KB |
| 01_epsilon-soft-policies.en.srt | 7.5 KB |
| 03_upper-confidence-bound-ucb-action-selection.en.srt | 7.5 KB |
| 03_what-is-a-model.en.srt | 7.5 KB |
| 03_specifying-policies.en.srt | 7.5 KB |
| 03_warren-powell-approximate-dynamic-programming-for-fleet-management-short.en.txt | 7.5 KB |
| 01_estimating-the-policy-gradient.en.srt | 7.5 KB |
| 01_gradient-descent-for-training-neural-networks.en.txt | 7.5 KB |
| 04_meeting-with-martha-in-depth-on-experience-replay.en.srt | 7.4 KB |
| 04_jonathan-langford-contextual-bandits-for-real-world-reinforcement-learning.en.txt | 7.3 KB |
| 03_how-is-q-learning-off-policy.en.srt | 7.2 KB |
| 04_iterative-policy-evaluation.en.txt | 7.2 KB |
| 02_policy-iteration.en.txt | 7.1 KB |
| 01_meeting-with-adam-getting-the-agent-details-right.en.srt | 7.1 KB |
| 01_flexibility-of-the-policy-iteration-framework.en.srt | 7.1 KB |
| 01_what-if-the-model-is-inaccurate.en.srt | 7.0 KB |
| 04_week-4-summary.en.srt | 7.0 KB |
| 02_why-bellman-equations.en.srt | 7.0 KB |
| 05_week-1-summary.en.txt | 2.7 KB |
| 01_learning-action-values.en.srt | 7.0 KB |
| 06_chapter-summary_instructions.html | 1.2 KB |
| 03_gaussian-policies-for-continuous-actions.en.txt | 6.9 KB |
| 01_the-dyna-architecture.en.srt | 6.9 KB |
| 02_bandits-and-exploration-exploitation_instructions.html | 1.1 KB |
| 01_module-2-learning-objectives_instructions.html | 2.4 KB |
| 02_weekly-reading_instructions.html | 1.2 KB |
| 03_lets-review-dyna-q-learning-in-a-simple-maze.en.srt | 6.9 KB |
| 03_dyna-q-learning-in-a-simple-maze.en.srt | 6.9 KB |
| 02_comparing-td-and-monte-carlo-with-state-aggregation.en.srt | 6.9 KB |
| 04_examples-of-mdps.en.srt | 6.8 KB |
| 02_drew-bagnell-on-system-id-optimal-control.en.txt | 6.8 KB |
| 01_initial-project-meeting-with-martha-formalizing-the-problem.en.srt | 6.8 KB |
| 03_using-optimal-value-functions-to-get-optimal-policies.en.txt | 6.7 KB |
| 03_week-1-summary.en.srt | 6.7 KB |
| 01_the-goal-of-reinforcement-learning.en.txt | 2.6 KB |
| 03_drew-bagnell-self-driving-robotics-and-model-based-rl.en.txt | 6.7 KB |
| 03_policy-evaluation-vs-control.en.srt | 6.7 KB |
| 04_your-specialization-roadmap.en.srt | 6.6 KB |
| 02_andy-barto-on-what-are-eligibility-traces-and-why-are-they-so-named.en.txt | 6.6 KB |
| 02_importance-sampling.en.srt | 6.6 KB |
| 01_what-is-the-trade-off.en.txt | 6.6 KB |
| 01_exploration-under-function-approximation.en.srt | 6.5 KB |
| 01_policy-improvement.en.srt | 6.5 KB |
| 02_examples-of-episodic-and-continuing-tasks.en.txt | 2.5 KB |
| 03_solving-the-blackjack-example.en.srt | 6.5 KB |
| 03_week-2-summary.en.srt | 2.8 KB |
| 03_week-2-summary.en.txt | 1.5 KB |
| 01_optimal-policies.en.txt | 6.4 KB |
| 04_week-3-summary.en.srt | 6.4 KB |
| 02_graded-assignment-describe-three-mdps_peer_assignment_instructions.html | 2.3 KB |
| 01_congratulations.en.srt | 6.3 KB |
| 03_susan-murphy-on-rl-in-mobile-health.en.txt | 6.3 KB |
| 01_the-linear-td-update.en.srt | 6.3 KB |
| 05_framing-value-estimation-as-supervised-learning.en.srt | 6.3 KB |
| 01_the-value-error-objective.en.srt | 6.2 KB |
| 04_state-aggregation-with-monte-carlo.en.txt | 6.2 KB |
| 03_episodic-sarsa-with-function-approximation.en.srt | 6.2 KB |
| 02_introducing-gradient-descent.en.txt | 6.2 KB |
| 03_sarsa-gpi-with-td.en.srt | 6.1 KB |
| 01_lets-review-non-linear-approximation-with-neural-networks.en.srt | 6.1 KB |
| 02_non-linear-approximation-with-neural-networks.en.srt | 6.1 KB |
| 03_doina-precup-building-knowledge-for-ai-agents-with-reinforcement-learning.en.txt | 6.1 KB |
| 05_csaba-szepesvari-on-problem-landscape.en.txt | 6.1 KB |
| 01_actor-critic-with-softmax-policies.en.srt | 6.0 KB |
| 01_why-does-off-policy-learning-matter.en.srt | 5.9 KB |
| 04_rich-sutton-the-importance-of-td-learning.en.txt | 5.9 KB |
| 03_deep-neural-networks.en.srt | 5.9 KB |
| 02_demonstration-with-actor-critic.en.txt | 5.9 KB |
| 06_andy-and-rich-advice-for-students.en.srt | 5.8 KB |
| 01_agent-architecture-meeting-with-martha-overview-of-design-choices.en.txt | 5.8 KB |
| 02_q-learning-in-the-windy-grid-world.en.srt | 5.8 KB |
| 03_what-is-monte-carlo.en.txt | 5.6 KB |
| 04_using-monte-carlo-for-prediction.en.txt | 5.6 KB |
| 05_week-1-summary.en.srt | 5.6 KB |
| 03_moving-to-parameterized-functions.en.txt | 5.6 KB |
| 04_value-functions.en.txt | 5.5 KB |
| 01_what-is-a-neural-network.en.srt | 5.5 KB |
| 01__resources.html | 5.5 KB |
| 2 | 275 bytes |
| 05_chapter-summary_RLbook2018.pdf | 85.3 MB |
| 05_chapter-summary_instructions.html | 1.1 KB |
| 03_learning-policies-directly.en.txt | 5.4 KB |
| 03_specialization-wrap-up.en.srt | 5.4 KB |
| 05_rich-sutton-and-andy-barto-a-brief-history-of-rl.en.txt | 5.4 KB |
| 01_module-4-learning-objectives_instructions.html | 3.0 KB |
| 02_weekly-reading_instructions.html | 1.2 KB |
| 01_random-tabular-q-planning.en.srt | 5.4 KB |
| 02_optimistic-initial-values.en.txt | 5.4 KB |
| 03_markov-decision-processes.en.txt | 5.2 KB |
| 01_lets-review-markov-decision-processes.en.txt | 5.2 KB |
| 05_tile-coding.en.srt | 5.2 KB |
| 01_bellman-equation-derivation.en.txt | 5.1 KB |
| 05_martin-riedmiller-on-the-collect-and-infer-framework-for-data-efficient-rl.en.txt | 5.1 KB |
| 01_meeting-with-adam-parameter-studies-in-rl.en.txt | 5.1 KB |
| 04_generalization-properties-of-coarse-coding.en.txt | 5.0 KB |
| 02_lets-review-what-is-q-learning.en.srt | 4.9 KB |
| 01_what-is-q-learning.en.srt | 4.9 KB |
| 02_in-depth-with-changing-environments.en.txt | 4.9 KB |
| 01_specialization-introduction.en.srt | 4.9 KB |
| 02_the-policy-gradient-theorem.en.txt | 4.9 KB |
| 02_actor-critic-algorithm.en.txt | 4.9 KB |
| 04_lets-review-actor-critic-algorithm.en.txt | 4.9 KB |
| 03_gradient-monte-for-policy-evaluation.en.txt | 4.9 KB |
| 01_the-goal-of-reinforcement-learning.en.srt | 4.9 KB |
| 03_coarse-coding.en.srt | 4.9 KB |
| 02_efficiency-of-dynamic-programming.en.txt | 4.8 KB |
| 01_epsilon-soft-policies.en.txt | 4.8 KB |
| 01_the-objective-for-learning-policies.en.txt | 4.7 KB |
| 01_using-monte-carlo-for-action-values.en.srt | 4.7 KB |
| 04_advantages-of-policy-parameterization.en.txt | 4.7 KB |
| [TGx]Downloaded from torrentgalaxy.to .txt | 585 bytes |
| 01_course-3-introduction.en.txt | 4.7 KB |
| 05_week-4-summary.en.txt | 2.4 KB |
| 02_lets-review-examples-of-episodic-and-continuing-tasks.en.srt | 4.7 KB |
| 06_chapter-summary_instructions.html | 1.2 KB |
| 02_examples-of-episodic-and-continuing-tasks.en.srt | 4.7 KB |
| 03_week-3-review.en.srt | 4.7 KB |
| 02_optimal-policies-with-dynamic-programming_instructions.html | 1.1 KB |
| 04_meeting-with-martha-in-depth-on-experience-replay.en.txt | 4.7 KB |
| 03_sequential-decision-making-with-evaluative-feedback.en.txt | 4.6 KB |
| 04_episodic-sarsa-in-mountain-car.en.txt | 4.6 KB |
| 01_estimating-the-policy-gradient.en.txt | 4.6 KB |
| 04_generalization-and-discrimination.en.txt | 4.6 KB |
| 01_semi-gradient-td-for-policy-evaluation.en.srt | 4.6 KB |
| 01_meeting-with-niko-choosing-the-learning-algorithm.en.srt | 4.6 KB |
| 01_lets-review-optimization-strategies-for-nns.en.txt | 4.5 KB |
| 02_optimization-strategies-for-nns.en.txt | 4.5 KB |
| 01_lets-review-expected-sarsa.en.srt | 4.5 KB |
| 01_expected-sarsa.en.srt | 4.5 KB |
| 04_reinforcement-learning-textbook_instructions.html | 2.2 KB |
| 02_optimal-value-functions.en.txt | 4.5 KB |
| 05_week-4-summary.en.srt | 4.5 KB |
| 02_weekly-reading-on-policy-prediction-with-approximation_instructions.html | 1.2 KB |
| 02_why-bellman-equations.en.txt | 4.4 KB |
| 01_meeting-with-adam-getting-the-agent-details-right.en.txt | 4.4 KB |
| 05_week-1-summary.en.srt | 4.3 KB |
| 06_using-tile-coding-in-td.en.txt | 4.3 KB |
| 02_the-true-objective-for-td.en.txt | 4.3 KB |
| 01_the-advantages-of-temporal-difference-learning.en.txt | 4.3 KB |
| 02_estimating-action-values-incrementally.en.txt | 4.3 KB |
| 01_the-dyna-architecture.en.txt | 4.3 KB |
| 01_lets-review-comparing-td-and-monte-carlo.en.txt | 4.3 KB |
| 02_comparing-td-and-monte-carlo.en.txt | 4.3 KB |
| 01_course-4-introduction.en.srt | 4.2 KB |
| 01_congratulations-course-4-preview.en.srt | 4.2 KB |
| 01_initial-project-meeting-with-martha-formalizing-the-problem.en.txt | 4.2 KB |
| 03_lets-review-dyna-q-learning-in-a-simple-maze.en.txt | 4.2 KB |
| 03_dyna-q-learning-in-a-simple-maze.en.txt | 4.2 KB |
| 03_policy-evaluation-vs-control.en.txt | 4.2 KB |
| 02_the-dyna-algorithm.en.txt | 4.2 KB |
| 03_off-policy-monte-carlo-prediction.en.txt | 4.1 KB |
| 03_what-is-temporal-difference-td-learning.en.txt | 4.1 KB |
| 04_week-2-review.en.srt | 4.1 KB |
| 03_upper-confidence-bound-ucb-action-selection.en.txt | 4.0 KB |
| 02_using-monte-carlo-methods-for-generalized-policy-iteration.en.srt | 4.0 KB |
| 03_specifying-policies.en.txt | 4.0 KB |
| 01_semi-gradient-td-for-policy-evaluation.en.txt | 2.9 KB |
| 01_course-introduction.en.srt | 4.0 KB |
| 01_continuing-tasks.en.txt | 4.0 KB |
| 03_how-is-q-learning-off-policy.en.txt | 4.0 KB |
| 03_what-is-a-model.en.txt | 4.0 KB |
| 01_module-1-learning-objectives_instructions.html | 4.0 KB |
| 05_expected-sarsa-with-function-approximation.en.srt | 3.9 KB |
| 02_lets-review-expected-sarsa-with-function-approximation.en.srt | 3.9 KB |
| 01_meeting-with-martha-discussing-your-results.en.srt | 3.9 KB |
| 04_sarsa-in-the-windy-grid-world.en.srt | 3.9 KB |
| 04_comparing-sample-and-distribution-models.en.srt | 3.9 KB |
| 03_episodic-sarsa-with-function-approximation.en.txt | 3.9 KB |
| 02_non-linear-approximation-with-neural-networks.en.txt | 3.9 KB |
| 01_lets-review-non-linear-approximation-with-neural-networks.en.txt | 3.9 KB |
| 01_why-does-off-policy-learning-matter.en.txt | 3.8 KB |
| 01_flexibility-of-the-policy-iteration-framework.en.txt | 3.8 KB |
| 01_learning-action-values.en.txt | 3.8 KB |
| 01_what-if-the-model-is-inaccurate.en.txt | 3.8 KB |
| 02_weekly-reading-on-policy-prediction-with-approximation-ii_instructions.html | 1.2 KB |
| 02_expected-sarsa-in-the-cliff-world.en.srt | 3.7 KB |
| 01_actor-critic-with-softmax-policies.en.txt | 3.7 KB |
| 04_examples-of-mdps.en.txt | 3.7 KB |
| 04_week-4-summary.en.txt | 3.7 KB |
| 02_comparing-td-and-monte-carlo-with-state-aggregation.en.txt | 3.6 KB |
| 04_pre-requisites-and-learning-objectives_instructions.html | 3.6 KB |
| 03_week-1-summary.en.txt | 3.6 KB |
| 01_module-4-learning-objectives_instructions.html | 3.5 KB |
| 05_tile-coding.en.txt | 2.8 KB |
| 06_andy-and-rich-advice-for-students.en.txt | 3.5 KB |
| 01_exploration-under-function-approximation.en.txt | 3.5 KB |
| 01_policy-improvement.en.txt | 3.5 KB |
| 02_importance-sampling.en.txt | 3.5 KB |
| 04_your-specialization-roadmap.en.txt | 3.5 KB |
| 01_what-is-a-neural-network.en.txt | 3.0 KB |
| 01_the-value-error-objective.en.txt | 3.4 KB |
| 03_solving-the-blackjack-example.en.txt | 3.4 KB |
| 01_congratulations.en.txt | 3.4 KB |
| 03_specialization-wrap-up.en.txt | 3.4 KB |
| 04_week-3-summary.en.txt | 3.4 KB |
| 05_framing-value-estimation-as-supervised-learning.en.txt | 3.3 KB |
| 01_congratulations.en.srt | 3.3 KB |
| 01_the-linear-td-update.en.txt | 3.3 KB |
| 03_sarsa-gpi-with-td.en.txt | 3.2 KB |
| 01_module-3-learning-objectives_instructions.html | 3.2 KB |
| 03_read-me-pre-requisites-and-learning-objectives_instructions.html | 3.2 KB |
| 03_deep-neural-networks.en.txt | 3.2 KB |
| 04_week-2-summary.en.srt | 3.1 KB |
| 01_module-2-learning-objectives_instructions.html | 3.1 KB |
| 02_q-learning-in-the-windy-grid-world.en.txt | 3.0 KB |
| 01_module-1-learning-objectives_instructions.html | 3.0 KB |
| 03_coarse-coding.en.txt | 3.0 KB |
| 04_week-2-review.en.txt | 2.2 KB |
| 04_read-me-pre-requisites-and-learning-objectives_instructions.html | 3.0 KB |
| 01_module-3-learning-objectives_instructions.html | 2.2 KB |
| 02_weekly-reading-on-policy-control-with-approximation_instructions.html | 1.3 KB |
| 05_week-1-summary.en.txt | 3.0 KB |
| 01_random-tabular-q-planning.en.txt | 2.9 KB |
| 03_generality-of-expected-sarsa.en.srt | 2.9 KB |
| 01_module-4-learning-objectives_instructions.html | 2.9 KB |
| 01_module-3-learning-objectives_instructions.html | 2.8 KB |
| 01_expected-sarsa.en.txt | 2.8 KB |
| 04_week-3-summary.en.srt | 2.7 KB |
| 01_what-is-q-learning.en.txt | 2.6 KB |
| 05_expected-sarsa-with-function-approximation.en.txt | 2.1 KB |
| 04_week-4-summary.en.srt | 2.6 KB |
| 01_using-monte-carlo-for-action-values.en.txt | 2.5 KB |
| 03_week-3-review.en.txt | 2.5 KB |
| 04_sarsa-in-the-windy-grid-world.en.txt | 2.4 KB |
| 02_expected-sarsa-in-the-cliff-world.en.txt | 2.3 KB |
| 01_congratulations-course-4-preview.en.txt | 2.3 KB |
| 03_reinforcement-learning-textbook_instructions.html | 2.2 KB |
| 01_course-introduction.en.txt | 2.1 KB |
| 02_using-monte-carlo-methods-for-generalized-policy-iteration.en.txt | 2.1 KB |
| 04_comparing-sample-and-distribution-models.en.txt | 2.1 KB |
| 01_congratulations.en.txt | 2.1 KB |
| 01_module-2-learning-objectives_instructions.html | 1.7 KB |
| 04_week-2-summary.en.txt | 1.7 KB |
| 04_week-3-summary.en.txt | 1.6 KB |
| 02_weekly-reading-policy-gradient-methods_instructions.html | 1.2 KB |
| 03_generality-of-expected-sarsa.en.txt | 1.5 KB |
| 04_week-4-summary.en.txt | 1.4 KB |
| 05_chapter-summary_instructions.html | 1.2 KB |
| 06_text-book-part-1-summary_instructions.html | 1.2 KB |
| 06_chapter-summary_instructions.html | 1.2 KB |
| 02_weekly-reading_instructions.html | 1.2 KB |
| 02_weekly-reading_instructions.html | 1.2 KB |
| 02_weekly-reading_instructions.html | 1.2 KB |
| 05_chapter-summary_instructions.html | 1.2 KB |
| 02_weekly-reading_instructions.html | 1.2 KB |
| 3 | 155.0 KB |
| 06_chapter-summary_RLbook2018.pdf | 85.3 MB |
| 4 | 734.2 KB |
| 03_reinforcement-learning-textbook_RLbook2018.pdf | 85.3 MB |
| 5 | 734.2 KB |
| 02_weekly-reading_RLbook2018.pdf | 85.3 MB |
| 6 | 734.2 KB |
| 04_reinforcement-learning-textbook_RLbook2018.pdf | 85.3 MB |
| 7 | 734.2 KB |
| 02_weekly-reading_RLbook2018.pdf | 85.3 MB |
| 8 | 734.2 KB |
| 03_reinforcement-learning-textbook_RLbook2018.pdf | 85.3 MB |
| 9 | 734.2 KB |
| 02_weekly-reading-on-policy-control-with-approximation_RLbook2018.pdf | 85.3 MB |
| 10 | 734.2 KB |
| 06_chapter-summary_RLbook2018.pdf | 85.3 MB |
| 11 | 734.2 KB |
| 02_weekly-reading_RLbook2018.pdf | 85.3 MB |
| 12 | 734.2 KB |
| 02_weekly-reading_RLbook2018.pdf | 85.3 MB |
| 13 | 734.2 KB |
| 02_weekly-reading_RLbook2018.pdf | 85.3 MB |
| 14 | 734.2 KB |
| 02_weekly-reading-on-policy-prediction-with-approximation-ii_RLbook2018.pdf | 85.3 MB |
| 15 | 734.2 KB |
| 02_weekly-reading-policy-gradient-methods_RLbook2018.pdf | 85.3 MB |
| 16 | 734.2 KB |
| 05_chapter-summary_RLbook2018.pdf | 85.3 MB |
| 17 | 734.2 KB |
| 02_weekly-reading_RLbook2018.pdf | 85.3 MB |
| 18 | 734.2 KB |
| 02_weekly-reading_RLbook2018.pdf | 85.3 MB |
| 19 | 734.2 KB |
| 05_chapter-summary_RLbook2018.pdf | 85.3 MB |
| 20 | 734.2 KB |
| 06_text-book-part-1-summary_RLbook2018.pdf | 85.3 MB |
| 21 | 734.2 KB |
| 06_chapter-summary_RLbook2018.pdf | 85.3 MB |
| 22 | 734.2 KB |
| 02_weekly-reading-on-policy-prediction-with-approximation_RLbook2018.pdf | 85.3 MB |
| 23 | 734.2 KB |
| 02_michael-littman-the-reward-hypothesis.mp4 | 84.0 MB |
| 24 | 1014.7 KB |
| 03_andy-barto-and-rich-sutton-more-on-the-history-of-rl.mp4 | 80.2 MB |
| 25 | 813.0 KB |
| 03_doina-precup-building-knowledge-for-ai-agents-with-reinforcement-learning.mp4 | 55.3 MB |
| 26 | 729.9 KB |
| 05_rich-sutton-and-andy-barto-a-brief-history-of-rl.mp4 | 48.7 MB |
| 27 | 259.0 KB |
| 03_warren-powell-approximate-dynamic-programming-for-fleet-management-short.mp4 | 47.1 MB |
| 28 | 895.8 KB |
| 03_meet-your-instructors.mp4 | 43.9 MB |
| 29 | 137.6 KB |
| 02_meet-your-instructors.mp4 | 43.9 MB |
| 30 | 137.6 KB |
| 02_meet-your-instructors.mp4 | 43.9 MB |
| 31 | 137.6 KB |
| 02_meet-your-instructors.mp4 | 43.9 MB |
| 32 | 137.6 KB |
| 03_david-silver-on-deep-learning-rl-ai.mp4 | 41.4 MB |
| 33 | 601.5 KB |
| 05_csaba-szepesvari-on-problem-landscape.mp4 | 38.8 MB |
| 34 | 198.0 KB |
| 02_andy-barto-on-what-are-eligibility-traces-and-why-are-they-so-named.mp4 | 38.5 MB |
| 35 | 503.7 KB |
| 04_emma-brunskill-batch-reinforcement-learning.mp4 | 37.4 MB |
| 36 | 629.9 KB |
| 04_rich-sutton-the-importance-of-td-learning.mp4 | 35.6 MB |
| 37 | 363.3 KB |
| 03_drew-bagnell-self-driving-robotics-and-model-based-rl.mp4 | 35.2 MB |
| 38 | 804.8 KB |
| 06_andy-and-rich-advice-for-students.mp4 | 33.4 MB |
| 39 | 625.2 KB |
| 02_course-introduction.mp4 | 32.4 MB |
| 40 | 622.0 KB |
| 02_drew-bagnell-on-system-id-optimal-control.mp4 | 31.3 MB |
| 41 | 730.9 KB |
| 02_joelle-pineau-about-rl-that-matters.mp4 | 29.5 MB |
| 42 | 515.0 KB |
| 02_demonstration-with-actor-critic.mp4 | 28.8 MB |
| 43 | 189.4 KB |
| 03_susan-murphy-on-rl-in-mobile-health.mp4 | 27.6 MB |
| 44 | 376.2 KB |
| 02_satinder-singh-on-intrinsic-rewards.mp4 | 26.9 MB |
| 45 | 90.4 KB |
| 04_advantages-of-policy-parameterization.mp4 | 26.1 MB |
| 46 | 966.6 KB |
| 03_moving-to-parameterized-functions.mp4 | 24.4 MB |
| 47 | 636.5 KB |
| 05_martin-riedmiller-on-the-collect-and-infer-framework-for-data-efficient-rl.mp4 | 23.5 MB |
| 48 | 471.8 KB |
| 06_using-tile-coding-in-td.mp4 | 23.1 MB |
| 49 | 952.7 KB |
| 01_congratulations-course-4-preview.mp4 | 22.1 MB |
| 50 | 908.7 KB |
| 01_course-4-introduction.mp4 | 22.1 MB |
| 51 | 908.7 KB |
| 01_what-is-the-trade-off.mp4 | 21.6 MB |
| 52 | 425.8 KB |
| 04_meeting-with-martha-in-depth-on-experience-replay.mp4 | 21.4 MB |
| 53 | 593.0 KB |
| 04_value-functions.mp4 | 21.1 MB |
| 54 | 925.0 KB |
| 04_state-aggregation-with-monte-carlo.mp4 | 20.3 MB |
| 55 | 762.6 KB |
| 03_gaussian-policies-for-continuous-actions.mp4 | 20.0 MB |
| 56 | 50.2 KB |
| 02_estimating-action-values-incrementally.mp4 | 19.4 MB |
| 57 | 616.2 KB |
| 01_average-reward-a-new-way-of-formulating-control-problems.mp4 | 19.1 MB |
| 58 | 945.5 KB |
| 03_lets-review-average-reward-a-new-way-of-formulating-control-problems.mp4 | 19.1 MB |
| 59 | 945.5 KB |
| 04_iterative-policy-evaluation.mp4 | 18.8 MB |
| 60 | 216.9 KB |
| 03_specialization-wrap-up.mp4 | 18.6 MB |
| 61 | 385.0 KB |
| 01_optimal-policies.mp4 | 18.5 MB |
| 62 | 548.4 KB |
| 01_specialization-introduction.mp4 | 18.3 MB |
| 63 | 760.1 KB |
| 03_episodic-sarsa-with-function-approximation.mp4 | 18.1 MB |
| 64 | 969.1 KB |
| 04_generalization-properties-of-coarse-coding.mp4 | 18.0 MB |
| 65 | 25.3 KB |
| 02_policy-iteration.mp4 | 17.9 MB |
| 66 | 138.6 KB |
| 03_learning-policies-directly.mp4 | 17.1 MB |
| 67 | 917.1 KB |
| 01_bellman-equation-derivation.mp4 | 17.0 MB |
| 68 | 994.2 KB |
| 03_using-optimal-value-functions-to-get-optimal-policies.mp4 | 16.7 MB |
| 69 | 275.1 KB |
| 01_actor-critic-with-softmax-policies.mp4 | 16.5 MB |
| 70 | 480.3 KB |
| 01_course-3-introduction.mp4 | 16.3 MB |
| 71 | 686.8 KB |
| 03_week-1-summary.mp4 | 16.3 MB |
| 72 | 701.7 KB |
| 03_sequential-decision-making-with-evaluative-feedback.mp4 | 16.3 MB |
| 73 | 742.8 KB |
| 04_using-monte-carlo-for-prediction.mp4 | 16.2 MB |
| 74 | 845.0 KB |
| 01_agent-architecture-meeting-with-martha-overview-of-design-choices.mp4 | 15.6 MB |
| 75 | 387.4 KB |
| 01_gradient-descent-for-training-neural-networks.mp4 | 15.5 MB |
| 76 | 477.0 KB |
| 04_episodic-sarsa-in-mountain-car.mp4 | 15.5 MB |
| 77 | 541.7 KB |
| 01_semi-gradient-td-for-policy-evaluation.mp4 | 15.3 MB |
| 78 | 666.0 KB |
| 03_deep-neural-networks.mp4 | 15.3 MB |
| 79 | 686.1 KB |
| 03_gradient-monte-for-policy-evaluation.mp4 | 15.2 MB |
| 80 | 781.5 KB |
| 02_introducing-gradient-descent.mp4 | 15.1 MB |
| 81 | 926.2 KB |
| 03_specifying-policies.mp4 | 15.0 MB |
| 82 | 12.4 KB |
| 04_your-specialization-roadmap.mp4 | 14.9 MB |
| 83 | 119.1 KB |
| 03_what-is-monte-carlo.mp4 | 14.9 MB |
| 84 | 123.3 KB |
| 01_why-does-off-policy-learning-matter.mp4 | 14.4 MB |
| 85 | 622.0 KB |
| 01_lets-review-optimization-strategies-for-nns.mp4 | 14.3 MB |
| 86 | 734.3 KB |
| 02_optimization-strategies-for-nns.mp4 | 14.3 MB |
| 87 | 734.3 KB |
| 01_learning-action-values.mp4 | 14.2 MB |
| 88 | 802.9 KB |
| 02_actor-critic-algorithm.mp4 | 14.1 MB |
| 89 | 949.5 KB |
| 04_lets-review-actor-critic-algorithm.mp4 | 14.1 MB |
| 90 | 949.5 KB |
| 02_efficiency-of-dynamic-programming.mp4 | 14.0 MB |
| 91 | 989.1 KB |
| 03_solving-the-blackjack-example.mp4 | 13.9 MB |
| 92 | 94.6 KB |
| 02_the-true-objective-for-td.mp4 | 13.7 MB |
| 93 | 351.8 KB |
| 01_estimating-the-policy-gradient.mp4 | 13.6 MB |
| 94 | 375.8 KB |
| 01_the-objective-for-learning-policies.mp4 | 13.4 MB |
| 95 | 660.6 KB |
| 03_policy-evaluation-vs-control.mp4 | 13.3 MB |
| 96 | 698.2 KB |
| 01_initial-project-meeting-with-martha-formalizing-the-problem.mp4 | 13.3 MB |
| 97 | 766.6 KB |
| 02_optimistic-initial-values.mp4 | 13.1 MB |
| 98 | 894.4 KB |
| 04_generalization-and-discrimination.mp4 | 12.9 MB |
| 99 | 144.1 KB |
| 01_epsilon-soft-policies.mp4 | 12.7 MB |
| 100 | 319.1 KB |
| 01_continuing-tasks.mp4 | 12.7 MB |
| 101 | 338.3 KB |
| 01_meeting-with-adam-getting-the-agent-details-right.mp4 | 12.6 MB |
| 102 | 410.3 KB |
| 03_off-policy-monte-carlo-prediction.mp4 | 12.5 MB |
| 103 | 496.3 KB |
| 01_flexibility-of-the-policy-iteration-framework.mp4 | 12.4 MB |
| 104 | 569.9 KB |
| 03_markov-decision-processes.mp4 | 12.4 MB |
| 105 | 659.1 KB |
| 01_lets-review-markov-decision-processes.mp4 | 12.4 MB |
| 106 | 659.1 KB |
| 04_examples-of-mdps.mp4 | 12.2 MB |
| 107 | 815.0 KB |
| 04_week-3-summary.mp4 | 11.9 MB |
| 108 | 55.3 KB |
| 02_in-depth-with-changing-environments.mp4 | 11.9 MB |
| 109 | 58.5 KB |
| 04_jonathan-langford-contextual-bandits-for-real-world-reinforcement-learning.mp4 | 11.9 MB |
| 110 | 65.8 KB |
| 02_why-bellman-equations.mp4 | 11.9 MB |
| 111 | 131.8 KB |
| 03_upper-confidence-bound-ucb-action-selection.mp4 | 11.8 MB |
| 112 | 234.1 KB |
| 02_comparing-td-and-monte-carlo-with-state-aggregation.mp4 | 11.5 MB |
| 113 | 466.2 KB |
| 01_meeting-with-adam-parameter-studies-in-rl.mp4 | 11.5 MB |
| 114 | 525.1 KB |
| 03_what-is-a-model.mp4 | 11.3 MB |
| 115 | 685.6 KB |
| 01_course-introduction.mp4 | 11.3 MB |
| 116 | 751.5 KB |
| 02_the-dyna-algorithm.mp4 | 11.2 MB |
| 117 | 780.5 KB |
| 01_congratulations.mp4 | 11.2 MB |
| 118 | 842.2 KB |
| 01_exploration-under-function-approximation.mp4 | 11.0 MB |
| 119 | 977.5 KB |
| 01_meeting-with-martha-discussing-your-results.mp4 | 11.0 MB |
| 120 | 48.0 KB |
| 01_the-value-error-objective.mp4 | 10.9 MB |
| 121 | 142.9 KB |
| 03_lets-review-dyna-q-learning-in-a-simple-maze.mp4 | 10.8 MB |
| 122 | 248.9 KB |
| 03_dyna-q-learning-in-a-simple-maze.mp4 | 10.8 MB |
| 123 | 248.9 KB |
| 05_framing-value-estimation-as-supervised-learning.mp4 | 10.7 MB |
| 124 | 315.9 KB |
| 03_what-is-temporal-difference-td-learning.mp4 | 10.3 MB |
| 125 | 709.2 KB |
| 02_optimal-value-functions.mp4 | 10.2 MB |
| 126 | 825.3 KB |
| 01_policy-improvement.mp4 | 10.0 MB |
| 127 | 8.2 KB |
| 04_week-4-summary.mp4 | 10.0 MB |
| 128 | 42.0 KB |
| 03_how-is-q-learning-off-policy.mp4 | 10.0 MB |
| 129 | 43.9 KB |
| 01_the-linear-td-update.mp4 | 9.9 MB |
| 130 | 102.2 KB |
| 01_lets-review-comparing-td-and-monte-carlo.mp4 | 9.8 MB |
| 131 | 192.9 KB |
| 02_comparing-td-and-monte-carlo.mp4 | 9.8 MB |
| 132 | 192.9 KB |
| 05_week-4-summary.mp4 | 9.6 MB |
| 133 | 396.8 KB |
| 01_the-dyna-architecture.mp4 | 9.6 MB |
| 134 | 415.3 KB |
| 05_week-1-summary.mp4 | 9.6 MB |
| 135 | 416.4 KB |
| 03_coarse-coding.mp4 | 9.6 MB |
| 136 | 422.8 KB |
| 02_non-linear-approximation-with-neural-networks.mp4 | 9.6 MB |
| 137 | 422.9 KB |
| 01_lets-review-non-linear-approximation-with-neural-networks.mp4 | 9.6 MB |
| 138 | 422.9 KB |
| 05_week-1-summary.mp4 | 9.5 MB |
| 139 | 534.7 KB |
| 02_the-policy-gradient-theorem.mp4 | 9.3 MB |
| 140 | 703.8 KB |
| 02_lets-review-examples-of-episodic-and-continuing-tasks.mp4 | 9.1 MB |
| 141 | 880.6 KB |
| 02_examples-of-episodic-and-continuing-tasks.mp4 | 9.1 MB |
| 142 | 880.6 KB |
| 01_the-advantages-of-temporal-difference-learning.mp4 | 9.1 MB |
| 143 | 926.6 KB |
| 03_week-3-review.mp4 | 8.9 MB |
| 144 | 125.9 KB |
| 04_week-2-review.mp4 | 8.5 MB |
| 145 | 513.3 KB |
| 01_the-goal-of-reinforcement-learning.mp4 | 8.0 MB |
| 146 | 1004.9 KB |
| 01_meeting-with-niko-choosing-the-learning-algorithm.mp4 | 7.9 MB |
| 147 | 122.9 KB |
| 02_lets-review-what-is-q-learning.mp4 | 7.8 MB |
| 148 | 166.6 KB |
| 01_what-is-q-learning.mp4 | 7.8 MB |
| 149 | 166.6 KB |
| 01_random-tabular-q-planning.mp4 | 7.8 MB |
| 150 | 177.8 KB |
| 02_course-wrap-up.mp4 | 7.8 MB |
| 151 | 246.6 KB |
| 01_what-if-the-model-is-inaccurate.mp4 | 7.7 MB |
| 152 | 315.9 KB |
| 05_expected-sarsa-with-function-approximation.mp4 | 7.6 MB |
| 153 | 379.7 KB |
| 02_lets-review-expected-sarsa-with-function-approximation.mp4 | 7.6 MB |
| 154 | 379.7 KB |
| 05_tile-coding.mp4 | 7.6 MB |
| 155 | 442.9 KB |
| 02_importance-sampling.mp4 | 7.4 MB |
| 156 | 602.7 KB |
| 04_week-2-summary.mp4 | 7.4 MB |
| 157 | 607.4 KB |
| 03_sarsa-gpi-with-td.mp4 | 7.4 MB |
| 158 | 633.7 KB |
| 02_q-learning-in-the-windy-grid-world.mp4 | 7.2 MB |
| 159 | 779.6 KB |
| 01_what-is-a-neural-network.mp4 | 7.0 MB |
| 160 | 997.8 KB |
| 04_comparing-sample-and-distribution-models.mp4 | 6.6 MB |
| 161 | 363.2 KB |
| 01_using-monte-carlo-for-action-values.mp4 | 6.5 MB |
| 162 | 544.5 KB |
| 01_expected-sarsa.mp4 | 6.3 MB |
| 163 | 752.7 KB |
| 01_lets-review-expected-sarsa.mp4 | 6.3 MB |
| 164 | 752.7 KB |
| 04_sarsa-in-the-windy-grid-world.mp4 | 5.9 MB |
| 165 | 152.1 KB |
| 02_expected-sarsa-in-the-cliff-world.mp4 | 5.7 MB |
| 166 | 318.6 KB |
| 03_week-2-summary.mp4 | 5.4 MB |
| 167 | 592.6 KB |
| 03_generality-of-expected-sarsa.mp4 | 5.2 MB |
| 168 | 805.6 KB |
| 02_using-monte-carlo-methods-for-generalized-policy-iteration.mp4 | 5.2 MB |
| 169 | 847.6 KB |
| 01_congratulations.mp4 | 4.4 MB |
| 170 | 658.2 KB |
| 04_week-4-summary.mp4 | 4.3 MB |
| 171 | 764.3 KB |
| 04_week-3-summary.mp4 | 3.7 MB |
Related torrents:

| Name | Uploader | Size | S/L | Added |
|---|---|---|---|---|
| - | freecoursewb | 85.3 MB | 14/0 | 2023-10-23 |
| - | freecoursewb | 21.9 MB | 4/16 | 2025-06-03 |
| - | FreeCourseWeb | 63.3 MB | 9/3 | 2023-07-01 |
| - | FreeCourseWeb | 10.3 MB | 0/4 | 2023-07-01 |
| - | freecoursewb | 34.1 MB | 29/0 | 2024-07-17 |
NOTE
SOURCE: Reinforcement Learning Specialization