
Function 1: void MDP::computeQValue(State& s, const int action)

This function is called by another function, valueIteration(), which you need to implement as well. As the name “computeQValue” implies, it updates the Q value of the state s for the action it takes. Every state has four Q values, one for each of the actions “East”, “South”, “West”, and “North”. If the input action is “South”, you need to update the Q value of s that corresponds to the “South” action.

The input:
(1) State &s - this is both the input and the output parameter of the function. Use s, together with its location and the second parameter “action”, to determine the potential next states. The data structure “State” is defined in the “MDP.h” file.
(2) int action - the action taken from state s to the next state. There are four different actions: East, South, West, North. They are integers, defined as macros at the beginning of MDP.h:
#define ACTION_EAST 0
#define ACTION_SOUTH 1
#define ACTION_WEST 2
#define ACTION_NORTH 3

Attention:
(a) Please use the above order of actions for the “q_values” in State s. There are four Q values for each state, which can be stored in an array, so the array members follow the same order as the four actions. For example, q_values[2] corresponds to ACTION_WEST and q_values[1] corresponds to ACTION_SOUTH.

Function 2: void MDP::valueIteration()

Though it is called “valueIteration”, no iteration happens inside this function. The iteration process is controlled by the “onGo()” function in the “VisualDisplay” class, which is the response function for the “play” button. Because we want to see the intermediate result of each iteration, the code for the iteration process was moved to the GUI part. To some extent, this simplifies your work on the “valueIteration()” implementation: you do not need a loop to do any iteration.
Simply assume that each call to this function performs one particular iteration. You only need to update the “state_value” of all the states, which are stored in the class data member "states[3][4]". To update a state value, first call “computeQValue(State &s, int action)” for each action to fill q_values[4], then choose the maximum Q value as the state value. Note that there are three special squares (states): the wall (1, 1), the diamond (1, 3), and the pitfall (2, 3). For these three states, you do not need to update the state values or q_values.

Some other information:
There are several important variables you need to use when you implement the above two functions. They are defined as macros at the top of the “MDP.h” file.
(1) “TRANSITION_SUCCEED” - the conditional probability of successfully reaching the expected next state. For example, when you take the action north, you have an 80% probability of arriving at the state to the north.
(2) “TRANSITION_FAIL” - the counterpart of “TRANSITION_SUCCEED”: you have a 20% probability of failing to land as expected. In that case you arrive at a state in one of the two directions neighboring the action, with 10% probability each. For example, when you take the action north, you have a 10% probability of landing to the west and a 10% probability of landing to the east.
(3) “GAMMA” - the discount factor.
(4) “ACTION_REWARD” - the instantaneous reward for each action.
(5) “CONVERGENCE” - this variable is used by the VisualDisplay class to determine when the iteration stops.

Requirements:
(1) Successful compilation and code building
(2) Successfully compute the Q-value for each state
(3) Successfully update the cur_convergence variable
(4) Successfully compute the state value for each state

Submission:
- The source code, like "[login to view URL]". If you modify other files or create additional files, please submit them as well.
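The two functions described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the reference solution: the State layout, the MDP constructor, the helper valueAfterMove(), and the is_wall / is_terminal flags are hypothetical, since the real definitions live in MDP.h. The values of GAMMA and ACTION_REWARD, the +1 / -1 values of the diamond and the pitfall, the use of the listed coordinates as [row][col] indices with north meaning a smaller row index, the rule that a blocked move leaves the agent in place, and the meaning of cur_convergence (largest absolute change in any state value) are also assumptions. Under those assumptions each Q value is computed as ACTION_REWARD + GAMMA * (0.8 * V(intended) + 0.1 * V(left) + 0.1 * V(right)).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

#define ACTION_EAST 0
#define ACTION_SOUTH 1
#define ACTION_WEST 2
#define ACTION_NORTH 3

// 0.8 and 0.2 follow the description; GAMMA and ACTION_REWARD are assumed values.
#define TRANSITION_SUCCEED 0.8
#define TRANSITION_FAIL 0.2
#define GAMMA 0.9
#define ACTION_REWARD (-0.1)

// Hypothetical State layout; the real one is defined in MDP.h.
struct State {
    int row = 0, col = 0;
    double state_value = 0.0;
    double q_values[4] = {0.0, 0.0, 0.0, 0.0};
    bool is_wall = false;      // cannot be entered
    bool is_terminal = false;  // diamond or pitfall: value stays fixed
};

class MDP {
public:
    State states[3][4];
    double cur_convergence = 0.0;

    MDP();
    void computeQValue(State& s, const int action);
    void valueIteration();

private:
    double valueAfterMove(const State& s, int direction) const;
};

// Assumed setup: listed coordinates used as [row][col], diamond +1, pitfall -1.
MDP::MDP() {
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 4; ++c) { states[r][c].row = r; states[r][c].col = c; }
    states[1][1].is_wall = true;
    states[1][3].is_terminal = true; states[1][3].state_value = 1.0;
    states[2][3].is_terminal = true; states[2][3].state_value = -1.0;
}

// Value of the square reached by moving one step in `direction`.
// Assumption: leaving the grid or hitting the wall keeps the agent in place.
double MDP::valueAfterMove(const State& s, int direction) const {
    static const int dRow[4] = {0, 1, 0, -1};  // East, South, West, North
    static const int dCol[4] = {1, 0, -1, 0};
    const int r = s.row + dRow[direction];
    const int c = s.col + dCol[direction];
    const bool offGrid = r < 0 || r >= 3 || c < 0 || c >= 4;
    if (offGrid || states[r][c].is_wall) return s.state_value;
    return states[r][c].state_value;
}

void MDP::computeQValue(State& s, const int action) {
    assert(action >= ACTION_EAST && action <= ACTION_NORTH);
    // With the order E, S, W, N the two neighboring directions are +/-1 mod 4.
    const int left = (action + 3) % 4;
    const int right = (action + 1) % 4;
    const double expected =
        TRANSITION_SUCCEED * valueAfterMove(s, action) +
        (TRANSITION_FAIL / 2) * valueAfterMove(s, left) +
        (TRANSITION_FAIL / 2) * valueAfterMove(s, right);
    s.q_values[action] = ACTION_REWARD + GAMMA * expected;
}

// One sweep only; the GUI is responsible for calling this repeatedly.
void MDP::valueIteration() {
    // Pass 1: compute every Q value from the previous iteration's state values.
    for (auto& row : states)
        for (State& s : row) {
            if (s.is_wall || s.is_terminal) continue;
            for (int a = ACTION_EAST; a <= ACTION_NORTH; ++a) computeQValue(s, a);
        }
    // Pass 2: commit the maximum Q value and track the largest change.
    cur_convergence = 0.0;
    for (auto& row : states)
        for (State& s : row) {
            if (s.is_wall || s.is_terminal) continue;
            const double best = *std::max_element(s.q_values, s.q_values + 4);
            cur_convergence = std::max(cur_convergence, std::fabs(best - s.state_value));
            s.state_value = best;
        }
}
```

The sweep is split into two passes so that every Q value in one call is computed from the state values of the previous call. Updating state_value in place during a single pass would also converge, but the intermediate results shown in the GUI would then depend on the order in which the squares are visited.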
Project ID: 34228078
6 proposals
Remote project
Active 4 yrs ago
6 freelancers are bidding on average ₹3,133 INR for this job

Hello, I have more than 7 years of experience in C++, and I have previously solved similar problems using Python. I assure you timely and accurate delivery within budget. Let us discuss the milestones and the project further on chat. Regards,
₹4,000 INR in 4 days
3.1

Hi there, I am a professional C++ developer with over 3 years of experience. I have read your details and can handle your project easily. I have also participated in many competitive programming competitions. I can deliver your project in an efficient and timely manner. I have done many similar projects in the past. Let's discuss the details in the chat. Waiting to hear from you.
₹2,800 INR in 1 day
1.6

Dear Client, I have 7+ years of experience in the same field, including in onshore organizations. I have done over 200 projects and understand your requirements well. I can assure you the best quality of work, with 100% satisfaction guaranteed, within your budget and the time given. If you are kind enough, let's start a chat; I can assure you delivery of the best quality with unlimited revisions, as I am looking for good reviews and feedback. Thanks.
₹600 INR in 2 days
1.5

I can guarantee a good product. Hey, I'm interested in your project and have read your requirements. We have 5+ years of experience and have worked on projects similar to what you are looking for. We offer a variety of IT services: custom software development, and qualified staff to develop and customize your software. Give us a call or WhatsApp +91 9430764087
₹3,800 INR in 7 days
0.0


New Delhi, India
Payment method verified
Member since Oct 19, 2019