1.沼尾正行日文簡報.ppt [相容模式] · •...

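Music Generation through Collaboration among Artificial Intelligence, Users, and Audiences

Masayuki Numao

The Institute of Scientific and Industrial Research, Osaka University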

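Self-Introduction

• Tokyo Institute of Technology, Department of Electrical and Electronic Engineering (1982)
– Research on the Z80, the 6800, and command languages

• Tokyo Institute of Technology, Graduate School of Science and Engineering, Department of Computer Science, doctoral program (1987)
– Research on machine learning

• Faculty member, Department of Computer Science, Faculty of Engineering, Tokyo Institute of Technology
– Explanation-based learning, inductive logic programming

• Stanford University CSLI (1989)

• The Institute of Scientific and Industrial Research, Osaka University (2003)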

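Frontiers of Artificial Intelligence

• Handling human emotion, and creative acts such as fiction and music

• Seeds of creation

– Reactions of people who have experienced a creative work

– Emotion measurement

– People's reactions to a new creation

• Today's talk: with music as the target

– Techniques for measuring and predicting emotion

– Automatic composition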

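Research on the Constructive Adaptive User Interface

"Let's go to the mountains next Sunday"

"Like we used to; if it rains, at the bottom of the river"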

Constructive Adaptive User Interface (CAUI)

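History of CAUI Research

• Arrangement by imitation [1993]
• Cognitive model of music [1994, 1996]
• Arrangement based on a cognitive model [1994]

– The quality of the arrangements improved dramatically between 1999 and 2001.

• Composition based on a cognitive model [2001]
• Composition based on sensor data [2007]
• Use of various sensors [2012]
• Collaboration with a folk duo [2016]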

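Goal of Activation through Music

• Arrangement and composition for activation, using a model based on biosignals, contributes to activating human potential.

[Figure labels: biosignals; music evaluation; biosignal-based model; arrangement and composition for activation; ∆ (→ triggers activation); ∆ (→ becomes activated)]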


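Activation Based on Biosignals

[Figure labels: song; person; emotion; EEG (frequency f / fractal dimension FD); emotion model; composition RB (rule base); classification of models; determine the initial state; activation composition]

・Classification of emotion models

・The emotion model and the composition RB are created simultaneously

・Innovative point

Simultaneous formation, based on brain information, of a composition RB that matches the emotion model

Research and development on activation through music ⇨ Brain Music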

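Brain Music

[Figure labels: modeling; activation; desired mood]

Biosignal-Based Model

[Figure labels: biosignals; annotation; data analysis; feature extraction and classification; emotion model; song progression model]

Novelty:

• Continuous evaluation of emotion

• Machine learning of sensing based on the user's own evaluations

Automatic Arrangement and Composition for Activation

Uses the emotion model and the composition rule base: user intervention is minimal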

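Composition Focused on Activation

[Figure: motif 1, motif 2, ..., motif N over a chord progression of 2N measures]

Symbiotic evolution achieves the goal at multiple levels of the song structure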

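Music Generation Procedure

[Figure labels: uplifting songs and gloomy songs (positive and negative examples); machine learning; composition RB (frame structure RB, motif RB, chord progression RB); genetic algorithm; symbiotic evolution; frame structure; chord progression; bass part; melody generator; song generator; MIDI generator. The composition RB is marked as the innovative part of this research.]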

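Example of a Composition Rule Obtained by Learning

tune(M) :- has_frame(M,A), measure_four_four(A), has_chords(M,X,Y,Z), chord_II(X), inversion_Zero(Y), chord_I(Z), form_VII(Z).

• A rule describing the song structure that a particular user feels is "beautiful"

Composition rule base (RB): meaning of the rule above

Song structure

Has the following frame structure

• Meter: 4/4

Has the following three consecutive chords

• First chord: root II
• Second chord: no inversion (root position)
• Third chord: root I, form: seventh chord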
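As a rough illustration of how a rule like tune(M) could be checked against a candidate piece, here is a minimal Python sketch. The dictionary layout (`meter`, `chords`, `root`, `inversion`, `form`), the function name `matches_rule`, and the sample piece are assumptions made for this example; they are not taken from the CAUI itself.

```python
from typing import Dict, List

# Hypothetical chord layout: {"root": "II", "inversion": 0, "form": "7th"}
Chord = Dict[str, object]


def matches_rule(tune: Dict[str, object]) -> bool:
    """Check the learned tune(M) rule against one piece."""
    if tune["meter"] != "4/4":  # measure_four_four(A)
        return False
    chords: List[Chord] = tune["chords"]
    # has_chords(M, X, Y, Z): slide over every run of three consecutive chords
    for x, y, z in zip(chords, chords[1:], chords[2:]):
        if (x["root"] == "II"              # chord_II(X)
                and y["inversion"] == 0    # inversion_Zero(Y)
                and z["root"] == "I"       # chord_I(Z)
                and z["form"] == "7th"):   # form_VII(Z)
            return True
    return False


# Toy piece invented for illustration only.
piece = {
    "meter": "4/4",
    "chords": [
        {"root": "IV", "inversion": 1, "form": "5th"},
        {"root": "II", "inversion": 0, "form": "5th"},
        {"root": "V", "inversion": 0, "form": "7th"},
        {"root": "I", "inversion": 0, "form": "7th"},
    ],
}
print(matches_rule(piece))
```

A rule base would hold many such rules per user and per impression; this sketch covers only the single rule shown on the slide.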

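Cognitive Model of Music

A Model for Predicting SD-Method Results

• Purpose

– Acquire a model of the subject's sensibility (kansei)

– Predict the subject's impressions of other pieces

• System

– Analyze the chord sequences.
– Measure impressions with the semantic differential (SD) method.
– Learn from the two above as training examples.
– Predict the impression of a different piece.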

2016/11/18

Data Collection

Subjects: 7

Training pieces: 75

Adjective pairs: 6 pairs = 12 words

Rating: 5-point scale

Arrangements: 36 pieces (= 3 pieces × 12 words)

Example of a Cognitive Model

• Subject A, "bright"

– frame(S) :- tonality_dur(S), tempo_allegro(S), basic_key_e(S), null.

– triplet(C1, C2, C3) :- dur(C3), inversion_Zero(C3), chord_I(C1), chord_I(C2), chord_II(C3), inversion_Zero(C1), form_VII(C2), null.

[Figure labels: C1, C2, C3]

Listener’s Evaluation of Musical Scores

Previous research: semantic differential method

• Cannot capture the dynamic nature of both music and emotion.

• Imposes a heavy cognitive load upon the listener.

Autonomic nervous system, brain waves

OUTLINE

[Figure labels: Musical scores; Subject evaluation; Music evaluation; Inductive learning (FOIL + Rx); Background knowledge (music theory); Affect ⇔ music relations model; GA fitness function; Melody generation]

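Emotion Recognition: Feature Extraction

• Feature extraction
– Fractal dimension (FD) value: represents the complexity of the EEG
– Power spectral density (PSD)
• Mean value of each frequency band

• Sliding window
– (0% overlap)

• Three types of song data
– Familiar songs
– Unfamiliar songs
– All songs (familiar + unfamiliar)

[Figure labels: low FD, high FD]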
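The PSD part of this feature extraction can be sketched as follows: split the signal into windows with 0% overlap and take the mean periodogram power per frequency band. This is only an illustration under stated assumptions. The band edges in `BANDS`, the sampling rate, the window length, and the synthetic test signal are all hypothetical (the slides do not give them), and a naive DFT stands in for whatever spectral estimator was actually used. The FD feature is not covered here.

```python
import cmath
import math
from typing import Dict, List, Tuple

# Assumed band edges in Hz; the slides do not list the bands that were used.
BANDS: Dict[str, Tuple[float, float]] = {
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}


def windows(signal: List[float], size: int) -> List[List[float]]:
    """Split the signal into consecutive windows with 0% overlap."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def band_means(window: List[float], fs: float) -> Dict[str, float]:
    """Mean periodogram power per band, using a naive O(n^2) DFT."""
    n = len(window)
    power = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(window))
        power.append(abs(coeff) ** 2 / (fs * n))
    result = {}
    for name, (lo, hi) in BANDS.items():
        values = [p for k, p in enumerate(power) if lo <= k * fs / n < hi]
        result[name] = sum(values) / len(values) if values else 0.0
    return result


# Synthetic 10 Hz sine sampled at 64 Hz, standing in for one EEG channel.
fs = 64.0
signal = [math.sin(2 * math.pi * 10.0 * t / fs) for t in range(256)]
features = [band_means(w, fs) for w in windows(signal, 64)]
```

Each entry of `features` corresponds to one window, so the feature sequence can later be aligned with the listener's ratings over time.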

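Emotion Recognition: Electrodes

• Constructing the features used for learning

– Added features (left-right differences)
• Fp1-Fp2
• F3-F4
• C3-C4
• Fz-Pz
• F7-F8
• T3-T4

– Mapping EEG to ratings: majority vote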
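The two steps on this slide, adding pairwise difference features and labeling each EEG window by majority vote, might look like the sketch below. The electrode pairs come from the slide; the function names, the per-electrode feature dictionary, and the tie-breaking behavior are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List

# Electrode pairs listed on the slide.
PAIRS = [("Fp1", "Fp2"), ("F3", "F4"), ("C3", "C4"),
         ("Fz", "Pz"), ("F7", "F8"), ("T3", "T4")]


def add_difference_features(features: Dict[str, float]) -> Dict[str, float]:
    """Return a copy of the per-electrode features plus one difference per pair."""
    out = dict(features)
    for a, b in PAIRS:
        out[f"{a}-{b}"] = features[a] - features[b]
    return out


def majority_label(ratings: List[str]) -> str:
    """Most frequent rating within a window; ties go to the rating seen first."""
    return Counter(ratings).most_common(1)[0][0]


# Toy values invented for illustration.
electrodes = [name for pair in PAIRS for name in pair]
sample = {name: float(i) for i, name in enumerate(electrodes)}
extended = add_difference_features(sample)
label = majority_label(["high", "low", "high"])
```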

Emotion Recognition: Results

[Figure: results for arousal and valence, before and after learning, for familiar songs, unfamiliar songs, and both; panels use either FD values or frequency bands, broken down by learning method]

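Arrangement Based on the Acquired Model

Changing the impression of a piece of music.

• Purpose

– Use the acquired model to generate chord sequences that give the intended impression.

• Experiment

– Acquire cognitive models from 75 well-known pieces: 36 classical pieces and 39 Japanese hit songs (J-POP).

– Arrange three pieces toward each impression.

– Evaluate the results in a psychological experiment.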


Arrangement Flow

[Figure labels: subject; data gathering; training piece; first eval.; song features and ratings; ILP; cognitive model; arranger; arranged piece; evaluation phase; second eval.]

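Evaluation of the Arrangement Results

[Figure: ratings (1 to 5) for each adjective pair (arrangement target), showing the original piece, the arrangements in the positive (+) and negative (-) directions, and subject A's favorite piece]

Statistical significance:

Direction  Level  Brightness  Stability  Favorability  Happiness  Beauty  Heartrending
Positive   5%     yes         yes        no            no         no      yes
Positive   1%     yes         no         no            no         no      yes
Negative   5%     yes         yes        yes           no         yes     no
Negative   1%     yes         no         yes           no         yes     no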

Why does subject A like it?

• Chord progression and a key
– A technique often used in J-POP music
• Transposition
• Change in 4th and 5th of a chord

• Instruments

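Composition Based on the Model

Generating new pieces

• Purpose

– Use the model to generate chord sequences and melodies that give the intended impression.

• Experiment

– Acquire cognitive models from 75 well-known pieces: 36 classical pieces and 39 Japanese hit songs (J-POP).

– Generate new pieces based on those models.

– Evaluate the results in a psychological experiment.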

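Composition Flow

[Figure labels: subject; training pieces; evaluation attributes; ILP; rule set; cognitive model; harmony theory; background knowledge; fitness; GA; chord progression + frame; chord sequence; melody generator MACS [Tsunoda’96]; melody; new piece]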

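Subjects' Evaluation of All Pieces

[Figure: ratings (1 to 5) for each adjective pair, in the positive and negative directions]

Statistical significance (Student's t-test):

Level  Brightness  Stability  Happiness  Beauty  Favorability  Heartrending
5%     yes         yes        yes        no      no            yes
1%     yes         no         no         no      no            yes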


First-Order Logic Knowledge Representation

• Provides comprehensibility of learning results

• Capable of handling complex relational data

Basic music theory

[Figure caption: Sonata [Beethoven]]

Chord features

Key                 As, B, Cis, D, Fis, G, A, H
Tonality            dur (major), moll (minor)
Root                I, II, III, IV, V, VI, VII
Form                5th, 7th, 9th
Inversion           zero, 1, 2, 3
Semi-own            present, absent
Secondary dominant  2, 4, 6, double dominant, absent
Function            tonic, dominant, subdominant
Special variations  Neapolitan 6th, sus4, absent

Frame features

Tempo               adagio, lento, andante, moderato, allegretto, allegro, presto, larghetto
Rhythm              2/2, 2/4, 4/4, 6/8
Melodic instrument  piano, sax (soprano, tenor)
Accompaniment       piano, guitar
Key                 As, B, Cis, D, Fis, G, A, H
Tonality            dur (major), moll (minor)
Cadence             complete (or authentic), half, other


Machine Learning Framework

• The CAUI employs FOIL and Rx to model the musical structures that correlate with the listener’s emotions, using the musical structures that make up the set of training examples.

• FOIL is a top-down inductive logic programming system guided by a heuristic function.

• Rx is a system that automatically refines the function-free first-order theory generated by FOIL.

[Figure labels: training examples; FOIL algorithm + Rx algorithm; first-order theory]
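The slide does not spell out FOIL's heuristic. The usual textbook form is the FOIL information gain, which scores a candidate literal by how much it concentrates positive examples. The sketch below uses that standard formula; whether the CAUI uses exactly this variant is an assumption, and the function name and arguments are illustrative.

```python
import math


def foil_gain(p0: int, n0: int, p1: int, n1: int, t: int) -> float:
    """Textbook FOIL gain for adding one literal to a clause.

    p0, n0: positive / negative bindings covered before adding the literal
    p1, n1: positive / negative bindings covered after adding it
    t:      positive bindings covered before that are still covered after
    """
    if p0 == 0 or p1 == 0:
        return 0.0  # nothing positive is covered, so there is nothing to gain
    info_before = -math.log2(p0 / (p0 + n0))
    info_after = -math.log2(p1 / (p1 + n1))
    return t * (info_before - info_after)


# A literal that keeps 3 of 4 positives and removes all 4 negatives.
gain = foil_gain(p0=4, n0=4, p1=3, n1=0, t=3)
```

A top-down learner greedily adds the literal with the highest gain until the clause covers no negative examples, then starts a new clause for the positives that remain uncovered.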


First-Order Logic Knowledge Representation

• The learner generates rules describing musical structures that can trigger the targeted emotional impression.

• CAUI learns three kinds of target rules, namely, frame, pair and triplet.

Model (Sad)


Genetic Algorithm for Structure Reproduction

• Using the model of user affect as the GA fitness function is unique to the CAUI.

• GA chromosome and operators

[Figure: chromosomes of the form C1S C2 C3 C4 C5 C6 C7 C8; crossover and mutation produce candidate chromosomes such as C5 C6 C7 C8 C1S C2 C3 C4]
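For readers unfamiliar with GA operators, here is a minimal sketch of crossover and mutation on an eight-gene chromosome. It uses generic one-point crossover and per-gene mutation, which may differ from the exact operators in the CAUI; the gene alphabet (chord roots only), the function names, and the mutation rate are assumptions for illustration.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical gene alphabet: chord roots only.
ROOTS = ["I", "II", "III", "IV", "V", "VI", "VII"]


def one_point_crossover(a: Sequence[str], b: Sequence[str],
                        point: int) -> Tuple[List[str], List[str]]:
    """Swap the tails of two parent chromosomes at the given cut point."""
    child1 = list(a[:point]) + list(b[point:])
    child2 = list(b[:point]) + list(a[point:])
    return child1, child2


def mutate(chromosome: Sequence[str], alphabet: Sequence[str],
           rate: float, rng: random.Random) -> List[str]:
    """Replace each gene with a random symbol with probability `rate`."""
    return [rng.choice(alphabet) if rng.random() < rate else gene
            for gene in chromosome]


rng = random.Random(0)
parent_a = ["I"] * 8
parent_b = ["V"] * 8
child_a, child_b = one_point_crossover(parent_a, parent_b, 4)
candidate = mutate(child_a, ROOTS, 0.1, rng)
```

The resulting candidates would then be ranked by the fitness function before the next generation is formed.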

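\[ \mathit{fitnessX}(M) = \sum_{i=1}^{L} \mathrm{Average}\bigl(\delta_{F}(P_i),\ \delta_{F'}(P_i),\ \delta_{FR}(P_i),\ \delta_{FR'}(P_i)\bigr) \]

\[ \mathit{fitnessX}(M) = \sum_{i=1}^{L} \mathrm{Average}\bigl(\eta(P_i)\bigr) \]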

Genetic Algorithm Fitness Function

fitnessChromosome(M) = fitnessUser(M) + fitnessTheory(M)

Each term is the sum fitnessFrame(M) + fitnessPair(M) + fitnessTriplet(M).

fitnessX        P_i (component/s of M)                L    Target relation  Music theory
fitnessFrame    song_frame                            1    frame            frame heuristic
fitnessPair     (chord_i, chord_{i+1})                n-1  pair             pair heuristic
fitnessTriplet  (chord_i, chord_{i+1}, chord_{i+2})   n-2  triplet          triplet heuristic
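The decomposition in this table can be sketched as follows: a chromosome M with n chords yields one frame, n-1 chord pairs, and n-2 chord triplets, and each fitness term sums a score over its components. The scoring callables here are placeholders; in the CAUI they would come from the learned user model and from music-theory heuristics, and the function names are assumptions for illustration.

```python
from typing import Callable, Sequence, Tuple

Score = Callable[[object], float]


def components(frame: object, chords: Sequence[object]) -> Tuple[list, list, list]:
    """Split M into its frame (L = 1), pairs (L = n-1) and triplets (L = n-2)."""
    pairs = list(zip(chords, chords[1:]))
    triplets = list(zip(chords, chords[1:], chords[2:]))
    return [frame], pairs, triplets


def fitness_part(frame: object, chords: Sequence[object],
                 scorers: Tuple[Score, Score, Score]) -> float:
    """fitnessFrame + fitnessPair + fitnessTriplet for one set of scorers."""
    total = 0.0
    for parts, score in zip(components(frame, chords), scorers):
        total += sum(score(p) for p in parts)
    return total


def fitness_chromosome(frame: object, chords: Sequence[object],
                       user: Tuple[Score, Score, Score],
                       theory: Tuple[Score, Score, Score]) -> float:
    """fitnessChromosome(M) = fitnessUser(M) + fitnessTheory(M)."""
    return fitness_part(frame, chords, user) + fitness_part(frame, chords, theory)


# Placeholder scorer that gives every component a score of 1.
def constant(_component: object) -> float:
    return 1.0


chords = ["I", "IV", "V", "I", "II", "V", "I", "I"]
scorers = (constant, constant, constant)
total = fitness_chromosome("frame", chords, scorers, scorers)
```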

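Features of Symbiotic Evolution

• A solution is represented as a combination of partial solutions

• The population of partial solutions and the population of whole solutions evolve in parallel

Population diversity can be maintained

– Avoids convergence to local optima

– Reaches suitable solutions in early generations

[Figure labels: partial-solution population; whole-solution population]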

Experimentation and Validation

• Data gathering

– 16 subjects rated the 14 musical scores
• All male, ages 18 to 27 (mean: 23)
• No formal music education

– Personalized models were learned for each subject
• 8 models were created.

– New musical pieces were composed independently
• 3 musical pieces were composed per model.

– The same subjects evaluated the user-specific composed pieces

• The significance of the composed tunes was measured by applying a paired t-test to the obtained emotion readings.
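For reference, the paired t statistic is the mean of the per-subject differences divided by its standard error. The sketch below computes it from two lists of readings. The numbers are toy values invented for illustration, not data from the study, and which readings were paired in the experiment is not stated on the slide.

```python
import math
from typing import Sequence


def paired_t(first: Sequence[float], second: Sequence[float]) -> float:
    """Paired t statistic for second minus first (n - 1 degrees of freedom)."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(variance / n)


# Toy readings for four hypothetical listeners.
baseline = [1.0, 2.0, 3.0, 4.0]
composed = [2.0, 4.0, 5.0, 8.0]
t_value = paired_t(baseline, composed)
```

The statistic is then compared against the critical value of Student's t distribution with n - 1 degrees of freedom at the chosen significance level.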

Experimental Result: Paired t-test

[Figure: Result of paired t-test; t-value (axis from 0 to 3) for each emotion: Stress, Joy, Sad, Relax]

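Summary

• Built tailor-made emotion models

• Established a method for building a composition rule base suited to each user

• So far, both the creation of the "biosignal-based model" and "arrangement and composition for activation" have been realized.

[Figure labels: biosignals; music evaluation; biosignal-based model; arrangement and composition for activation; ∆ (→ triggers activation); activation based on biosignals]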

Acknowledgement

• The systems were developed in my lab and in Prof. Otani’s lab at Tokyo City University.

• Contributors:

– Shoichi Takagi
– Tetsuo Takata
– Roberto Legaspi
– Rafael Cabredo
– Takayuki Nishikawa
– Toshihito Sugimoto
– Noriko Otani
– and others…
