riodohee


Building a basic ML-Agents environment

mlAgent 2018. 11. 18. 02:39

Today we will train an agent to chase a target without falling off the floor.

First, set up the basic environment. (As in the previous post, import the ML-Agents folder and the package into the project.)

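1. Create an empty object and name it Academy.

2. Create another empty object, name it Brain, and make it a child of Academy.

3. Create the floor as a Plane object. Set its position and rotation to 0 and its scale to 1. (Do not change the scale. The environment stays fixed so that only the agent's behavior matters, and the agent script later assumes the default 10-unit plane when it normalizes its observations.)

4. Create the target as well. Its size and position do not matter.

5. Finally, create the agent.

The agent is a Sphere game object with a Rigidbody component added.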
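Steps 1 to 5 can also be written as code. The following is a minimal sketch, assuming a throwaway MonoBehaviour placed on any object in an otherwise empty scene. The class name SceneSetupSketch, the object names Floor, Target and RollerAgent, the cube shape of the target, and the start positions are assumptions made for this example, not values from the post. It only creates the objects; the Academy, Brain, and agent scripts still have to be attached by hand.

```csharp
using UnityEngine;

// Sketch only: builds the object hierarchy from steps 1 to 5 at startup.
// All names and positions below are illustrative assumptions.
public class SceneSetupSketch : MonoBehaviour
{
    private void Awake()
    {
        // Steps 1 and 2: empty Academy object with a Brain child
        var academy = new GameObject("Academy");
        var brain = new GameObject("Brain");
        brain.transform.SetParent(academy.transform);

        // Step 3: floor plane at the origin, no rotation, scale 1
        var floor = GameObject.CreatePrimitive(PrimitiveType.Plane);
        floor.name = "Floor";
        floor.transform.position = Vector3.zero;
        floor.transform.rotation = Quaternion.identity;
        floor.transform.localScale = Vector3.one;

        // Step 4: target, size and position are free to choose
        var target = GameObject.CreatePrimitive(PrimitiveType.Cube);
        target.name = "Target";
        target.transform.position = new Vector3(3f, 0.5f, 3f);

        // Step 5: sphere agent with a Rigidbody
        var agent = GameObject.CreatePrimitive(PrimitiveType.Sphere);
        agent.name = "RollerAgent";
        agent.transform.position = new Vector3(0f, 0.5f, 0f);
        agent.AddComponent<Rigidbody>();
    }
}
```

Building the scene by hand in the editor, as the steps describe, remains the simpler route; the sketch is mainly useful as a compact record of the required hierarchy.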

Next, create the scripts.

1. academy script

2. agent script

1. academy script

1-1. Add using MLAgents;

1-2. Inherit from Academy

using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using MLAgents;

public class RollerAcademy : Academy {

}

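2. agent script

2-1. Set the reset values

2-2. Observe the environment

Every observation is divided by 5 to normalize it: the default Plane is 10 units across, running from -5 to 5 on the x and z axes, so dividing by 5 keeps the values on the order of 1.

2-3. Rewards

2-4. The function that receives instructions from the brain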

using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using MLAgents;

public class RollerAgent : Agent
{
    Rigidbody rBody;
    public Transform Target;

    // Assumption: the original listing reads _collision.obs without declaring it.
    // Here it is a reference to the helper component at the end of this listing,
    // whose obs flag is true while the agent touches an obstacle.
    public ObstacleCollision _collision;

    private void Start()
    {
        rBody = GetComponent<Rigidbody>();
    }

    // Called whenever the agent is reset
    public override void AgentReset()
    {
        if (this.transform.position.y < -1.0)
        {
            // The agent fell off: put it back at the center and stop it
            this.transform.position = Vector3.zero;
            this.rBody.angularVelocity = Vector3.zero;
            this.rBody.velocity = Vector3.zero;
        }
        else
        {
            // Move the target to a new random spot
            Target.position = new Vector3(Random.value * 8 - 4,
                                          0.5f,
                                          Random.value * 8 - 4);
        }
    }

    // Collects the agent's observations of the environment
    public override void CollectObservations()
    {
        // Position of the target relative to the agent
        Vector3 relativePosition = Target.position - this.transform.position;

        // Relative position, divided by 5 to normalize
        AddVectorObs(relativePosition.x / 5);
        AddVectorObs(relativePosition.z / 5);

        // Distance from the agent to the edges of the floor
        AddVectorObs((this.transform.position.x + 5) / 5);
        AddVectorObs((this.transform.position.x - 5) / 5);
        AddVectorObs((this.transform.position.z + 5) / 5);
        AddVectorObs((this.transform.position.z - 5) / 5);

        // Agent velocity
        AddVectorObs(rBody.velocity.x / 5);
        AddVectorObs(rBody.velocity.z / 5);
    }

    public float speed = 10;
    // Kept from the original listing; not used by the rewards below
    private float previousDistance = float.MaxValue;

    // Receives instructions from the brain
    public override void AgentAction(float[] vectorAction, string textAction)
    {
        // Rewards
        float distanceToTarget = Vector3.Distance(this.transform.position,
                                                  Target.position);

        // Reached target
        if (distanceToTarget < 1.42f)
        {
            AddReward(1.0f);
            Done();
        }

        // Time penalty
        AddReward(-0.05f);

        // Obstacle penalty (skipped when no helper is assigned)
        if (_collision != null && _collision.obs)
        {
            AddReward(-0.05f);
        }

        // Fell off platform
        if (this.transform.position.y < -1.0)
        {
            AddReward(-1.0f);
            Done();
        }

        // Actions, size = 2
        // Apply the instructions received from the brain as a force
        Vector3 controlSignal = Vector3.zero;
        controlSignal.x = vectorAction[0];
        controlSignal.z = vectorAction[1];
        rBody.AddForce(controlSignal * speed);
    }
}

// Assumed helper, not shown in the original post. In Unity, save it as
// ObstacleCollision.cs and attach it to the agent. The "Obstacle" tag is
// also an assumption and must exist in the project's tag list.
public class ObstacleCollision : MonoBehaviour
{
    public bool obs;

    private void OnCollisionEnter(Collision collision)
    {
        if (collision.gameObject.CompareTag("Obstacle")) obs = true;
    }

    private void OnCollisionExit(Collision collision)
    {
        if (collision.gameObject.CompareTag("Obstacle")) obs = false;
    }
}
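The division by 5 in CollectObservations can be isolated into a small helper to make the scaling explicit. This is a minimal sketch in plain C# with no Unity dependency; ObservationNormalizer, HalfExtent, Normalize and EdgeDistances are names invented for this example and are not part of ML-Agents.

```csharp
// Sketch only: mirrors the "/ 5" scaling used in CollectObservations.
// All names here are illustrative assumptions.
public static class ObservationNormalizer
{
    // A default Unity Plane at scale 1 spans -5..+5 on x and z.
    public const float HalfExtent = 5f;

    // Scales a world-space coordinate or offset by the half width of the floor.
    public static float Normalize(float value)
    {
        return value / HalfExtent;
    }

    // Normalized signed distances from a coordinate to the two opposite
    // floor edges, matching (x + 5) / 5 and (x - 5) / 5 in the agent script.
    public static float[] EdgeDistances(float coordinate)
    {
        return new[]
        {
            Normalize(coordinate + HalfExtent),
            Normalize(coordinate - HalfExtent)
        };
    }
}
```

For an agent at x = 2.5, EdgeDistances(2.5f) returns 1.5 and -0.5. If the floor were scaled, HalfExtent would have to change with it, which is why the floor's scale is left at 1.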

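Attach each script to the object with the matching name.

The agent script needs some extra work in the Inspector.

1. After adding the Brain script to the Brain game object, drag that object onto the agent's Brain field.

2. Drag the object used as the target onto the Target field.

Take the Brain script from the ML-Agents folder, add it to the Brain game object, and set its values as shown in the accompanying screenshot.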

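Next, let's use the player brain for a test.

Using the player brain requires key bindings for the inputs.

On the Brain, set the brain type to Player and configure the keys as shown in the accompanying screenshot.

Now you can play the scene and steer the agent yourself.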
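The key settings from the screenshot are not reproduced in the text, so the following sketch only illustrates the idea: each held key writes a value into one element of the two-element action vector that AgentAction reads (element 0 drives x, element 1 drives z). The WASD layout, KeyBinding and PlayerKeyMap are assumptions for this example, not settings taken from the post.

```csharp
using System.Collections.Generic;

// Sketch only: how a player brain could turn held keys into the
// action vector. The bindings below are an assumed WASD layout.
public struct KeyBinding
{
    public char Key;     // key that triggers the binding
    public int Index;    // element of the action vector to write
    public float Value;  // value written while the key is held

    public KeyBinding(char key, int index, float value)
    {
        Key = key;
        Index = index;
        Value = value;
    }
}

public static class PlayerKeyMap
{
    public static readonly KeyBinding[] Bindings =
    {
        new KeyBinding('D', 0,  1f), // push along +x
        new KeyBinding('A', 0, -1f), // push along -x
        new KeyBinding('W', 1,  1f), // push along +z
        new KeyBinding('S', 1, -1f)  // push along -z
    };

    // Builds the action vector for the keys that are currently held down.
    public static float[] ToAction(IEnumerable<char> heldKeys)
    {
        var action = new float[2];
        foreach (char key in heldKeys)
        {
            foreach (KeyBinding binding in Bindings)
            {
                if (binding.Key == key)
                {
                    action[binding.Index] = binding.Value;
                }
            }
        }
        return action;
    }
}
```

Holding W and D yields the vector (1, 1), which AgentAction multiplies by speed and applies as a force along +x and +z. With no key held, the vector stays at (0, 0) and no force is applied.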

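I also added an obstacle of my own and gave a negative reward for touching it, so that the agent would learn to chase the target while avoiding the obstacle. In practice, the training results did not change much over time, so the reward values probably need more tuning. Still, the agent did seem to find the target slightly faster. The video was recorded with the brain type set to External.

How to use External mode is described in the previous post.

As you can see, training shows no noticeable progress and the movement stays erratic.

Conclusion: watching machine learning produce movement on its own is interesting.

Reaching the desired goal seems to require a more carefully designed reward scheme than I expected.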
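One way to make reward tuning easier is to pull the reward rules out of AgentAction into a pure function that can be tried outside Unity. This is a minimal sketch: the thresholds and penalties are the ones in the agent script, while StepResult, RewardShaping, and the progressBonus term (a small reward for getting closer to the target, switched off by passing 0) are assumptions added for illustration.

```csharp
// Sketch only: the per-step reward from AgentAction as a pure function.
// StepResult, RewardShaping and progressBonus are hypothetical names.
public struct StepResult
{
    public float Reward; // total reward for this step
    public bool Done;    // true when the episode should end
}

public static class RewardShaping
{
    public static StepResult Evaluate(float distance, float previousDistance,
                                      bool touchingObstacle, bool fell,
                                      float progressBonus)
    {
        var result = new StepResult { Reward = 0f, Done = false };

        // Reached the target
        if (distance < 1.42f)
        {
            result.Reward += 1.0f;
            result.Done = true;
        }

        // Time penalty on every step
        result.Reward -= 0.05f;

        // Obstacle penalty
        if (touchingObstacle)
        {
            result.Reward -= 0.05f;
        }

        // Fell off the platform
        if (fell)
        {
            result.Reward -= 1.0f;
            result.Done = true;
        }

        // Optional shaping term (assumption, not in the post's script):
        // reward any step that reduces the distance to the target.
        if (distance < previousDistance)
        {
            result.Reward += progressBonus;
        }

        return result;
    }
}
```

Evaluate(1.0f, 2.0f, false, false, 0f) gives a reward of about 0.95 with Done set, which is the +1.0 goal reward minus the 0.05 time penalty. Whether a progress bonus actually helps this agent is untested here; it is only one candidate to try when adjusting the rewards.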

License: Attribution, NonCommercial, NoDerivatives


Posted by 도이(doi)

best free unity asset

data/information 2018. 11. 17. 21:52

https://zhuanlan.zhihu.com/p/34455368


Posted by 도이(doi)

Mathf

programming/c# 2018. 11. 17. 14:59

Abs	Returns the absolute value of f.
Acos	Returns the arc-cosine of f - the angle in radians whose cosine is f.
Approximately	Compares two floating point values and returns true if they are similar.
Asin	Returns the arc-sine of f - the angle in radians whose sine is f.
Atan	Returns the arc-tangent of f - the angle in radians whose tangent is f.
Atan2	Returns the angle in radians whose Tan is y/x.
Ceil	Returns the smallest integer greater than or equal to f (rounds up); the result is a float.
CeilToInt	Returns the smallest integer greater than or equal to f (rounds up); the result is an int.
Clamp	Clamps the given value between the given minimum and maximum float values.
Clamp01	Clamps value between 0 and 1 and returns value.
ClosestPowerOfTwo	Returns the closest power of two value.
CorrelatedColorTemperatureToRGB	Convert a color temperature in Kelvin to RGB color.
Cos	Returns the cosine of angle f.
DeltaAngle	Calculates the shortest difference between two given angles given in degrees.
Exp	Returns e raised to the specified power.
Floor	Returns the largest integer smaller than or equal to f (rounds down); the result is a float.
FloorToInt	Returns the largest integer smaller than or equal to f (rounds down); the result is an int.
GammaToLinearSpace	Converts the given value from gamma (sRGB) to linear color space.
InverseLerp	Calculates the linear parameter t that produces the interpolant value within the range [a, b].
IsPowerOfTwo	Returns true if the value is power of two.
Lerp	Linearly interpolates between a and b by t.
LerpAngle	Same as Lerp but makes sure the values interpolate correctly when they wrap around 360 degrees.
LerpUnclamped	Linearly interpolates between a and b by t with no limit to t.
LinearToGammaSpace	Converts the given value from linear to gamma (sRGB) color space.
Log	Returns the logarithm of a specified number in a specified base.
Log10	Returns the base 10 logarithm of a specified number.
Max	Returns the largest of two or more values.
Min	Returns the smallest of two or more values.
MoveTowards	Moves a value towards a target.
MoveTowardsAngle	Same as MoveTowards but makes sure the values interpolate correctly when they wrap around 360 degrees.
NextPowerOfTwo	Returns the next power of two value.
PerlinNoise	Generates 2D Perlin noise.
PingPong	PingPongs the value t, so that it is never larger than length and never smaller than 0.
Pow	Returns f raised to power p.
Repeat	Loops the value t, so that it is never larger than length and never smaller than 0.
Round	Returns f rounded to the nearest integer; the result is a float.
RoundToInt	Returns f rounded to the nearest integer; the result is an int.
Sign	Returns the sign of f.
Sin	Returns the sine of angle f.
SmoothDamp	Gradually changes a value towards a desired goal over time.
SmoothDampAngle	Gradually changes an angle given in degrees towards a desired goal angle over time.
SmoothStep	Interpolates between min and max with smoothing at the limits.
Sqrt	Returns square root of f.
Tan	Returns the tangent of angle f in radians.
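A few of the functions above are easier to tell apart with concrete numbers. The following is a minimal sketch, assuming a script attached to any object in a Unity scene; MathfExamples is a hypothetical class name, and the values in the comments follow from the definitions in the table.

```csharp
using UnityEngine;

// Sketch only: logs the results of some Mathf calls at startup.
public class MathfExamples : MonoBehaviour
{
    private void Start()
    {
        Debug.Log(Mathf.Clamp(12f, 0f, 10f));        // 10: limited to the maximum
        Debug.Log(Mathf.Lerp(0f, 10f, 0.25f));       // 2.5: a quarter of the way from 0 to 10
        Debug.Log(Mathf.InverseLerp(0f, 10f, 2.5f)); // 0.25: the t that Lerp would need
        Debug.Log(Mathf.CeilToInt(2.1f));            // 3: rounds up, returns an int
        Debug.Log(Mathf.FloorToInt(2.9f));           // 2: rounds down, returns an int
        Debug.Log(Mathf.Repeat(5f, 3f));             // 2: wraps around at length 3
        Debug.Log(Mathf.PingPong(3f, 2f));           // 1: bounces back after reaching 2
        Debug.Log(Mathf.DeltaAngle(350f, 10f));      // 20: shortest way from 350 to 10 degrees
        Debug.Log(Mathf.MoveTowards(0f, 10f, 3f));   // 3: moves by at most 3 toward 10
    }
}
```

Lerp and InverseLerp are inverses of each other within the range, and Repeat and PingPong differ only in what happens at the end of the range: Repeat jumps back to 0, while PingPong reverses direction.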


Posted by 도이(doi)
